Re: [HACKERS] GSoC 2017: Foreign Key Arrays
On 11/10/2017 01:47 AM, Mark Rofail wrote:
> I am sorry for the late reply

There is no reason for you to be. It did not take you 6 weeks to do a review. :)

Thanks for this new version.

== Functional review

>> 1) MATCH FULL does not seem to care about NULLS in arrays. In the example
>> below I expected both inserts into the referring table to fail.
>
> It seems in your example the only failed case was:
>
>     INSERT INTO fk VALUES (NULL, '{1}');
>
> which shouldn't work, can you clarify this?

I think that if you use MATCH FULL the query should fail if you have a NULL in the array.

>> 2) To me it was not obvious that ON DELETE CASCADE would delete the whole
>> rows rather than delete the members from the array, and this kind of
>> misunderstanding can lead to pretty bad surprises in production. I am
>> leaning towards not supporting CASCADE.
>
> I would say so too, maybe we should remove ON DELETE CASCADE until we have
> supported all remaining actions. I am leaning towards this too.

I would personally be fine with a first version without support for CASCADE since it is not obvious to me what CASCADE should do.

== The @>> operator

> I would argue that allocating an array of datums and building an array
> would have the same complexity

I am not sure what you mean here. Just because something has the same complexity does not mean there can't be major performance differences.

== Code review

>> I think the code in RI_Initial_Check() would be cleaner if you used
>> "CROSS JOIN LATERAL unnest(col)" rather than having unnest() in the
>> target list. This way you would not need to rename all columns and the
>> code paths for the array case could look more like the code path for the
>> normal case.
>
> Can you clarify what you mean a bit more?

I think the code would look cleaner if you generate the following query:

SELECT fk.x, fk.ys FROM ONLY t2 fk
CROSS JOIN LATERAL pg_catalog.unnest(ys) a2 (v)
LEFT OUTER JOIN ONLY t1 pk
ON pk.x = fk.x AND pk.y = a2.v
WHERE [...]
rather than:

SELECT fk.k1, fk.ak2 FROM
(SELECT x k1, pg_catalog.unnest(ys) k2, ys ak2 FROM ONLY t2) fk
LEFT OUTER JOIN ONLY t1 pk
ON pk.x = fk.k1 AND pk.y = fk.k2
WHERE [...]

== New stuff

When applying the patch I got some whitespace warnings:

Array-ELEMENT-foreign-key-v5.3.patch:1343: space before tab in indent, indent with spaces.
    format_type_be(oprleft), format_type_be(oprright;
Array-ELEMENT-foreign-key-v5.3.patch:1345: trailing whitespace.

When compiling I got an error:

ri_triggers.c: In function ‘ri_GenerateQual’:
ri_triggers.c:2693:19: error: unknown type name ‘d’
  Oid   oprcommon;d
                   ^
ri_triggers.c:2700:3: error: conflicting types for ‘oprright’
  oprright = get_array_type(operform->oprleft);
  ^~~~
ri_triggers.c:2691:9: note: previous declaration of ‘oprright’ was here
  Oid   oprright;
        ^~~~
: recipe for target 'ri_triggers.o' failed

When building the documentation I got two warnings:

/usr/bin/osx:catalogs.sgml:2349:17:W: empty end-tag
/usr/bin/osx:catalogs.sgml:2350:17:W: empty end-tag

When running the tests I got a failure in element_foreign_key.

Andreas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
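For reference, here is a self-contained sketch of the suggested query shape against a hypothetical schema. The table and column names follow the query above; the WHERE clause is my assumption, since the original elides it with [...] (RI_Initial_Check() looks for referencing rows with no matching referenced row):

```sql
-- Hypothetical tables matching the names used in the query above.
CREATE TABLE t1 (x int, y int, PRIMARY KEY (x, y));
CREATE TABLE t2 (x int, ys int[]);

-- Suggested shape: unnest the array in a LATERAL item instead of the
-- target list, so the referencing columns keep their original names.
SELECT fk.x, fk.ys
FROM ONLY t2 fk
    CROSS JOIN LATERAL pg_catalog.unnest(fk.ys) a2 (v)
    LEFT OUTER JOIN ONLY t1 pk
        ON pk.x = fk.x AND pk.y = a2.v
WHERE pk.x IS NULL;  -- assumed: report array elements with no referenced row
```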
Re: [HACKERS] GnuTLS support
On 09/18/2017 07:04 PM, Jeff Janes wrote:> You fixed the first issue, but I still get the second one: be-secure-gnutls.c: In function 'get_peer_certificate': be-secure-gnutls.c:667: error: 'GNUTLS_X509_CRT_LIST_SORT' undeclared (first use in this function) be-secure-gnutls.c:667: error: (Each undeclared identifier is reported only once be-secure-gnutls.c:667: error: for each function it appears in.) Thanks again for testing the code. I have now rebased the patch and fixed the second issue. I tested that it works on CentOS 6. Work which remains: - sslinfo - pgcrypto - Documentation - Decide if what I did with the config is a good idea Andreas diff --git a/configure b/configure index 4ecd2e1922..1ba34dfced 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -837,6 +838,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1531,6 +1533,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -5997,6 +6000,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? 
"no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10164,6 +10202,94 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char gnutls_init (); +int +main () +{ +return gnutls_init (); + ; + return 0; +} +_ACEOF +for ac_lib in '' gnutls; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_gnutls_init=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_gnutls_init+:} false; then : + break +fi +done +if ${ac_cv_search_gnutls_init+:} false; then : + +else + ac_cv_search_gnutls_init=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5 +$as_echo "$ac_cv_search_gnutls_init" >&6; } +ac_res=$ac_cv_search_gnutls_init +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +else + as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5 +fi + + # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted + # certificate chains. 
+ ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include +#include + +" +if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl +_ACEOF + + for ac_func in gnutls_pkcs11_set_pin_function +do : + ac_fn_c_check_func "$LINENO" "gnutls_pkcs11_set_pin_function" "ac_cv_func_gnutls_pkcs11_set_pin_function" +if test "x$ac_cv_func_gnutls_pkcs11_set_pin_function" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_GNUTLS_PKCS11_SET_PIN_FUNCTION 1 +_ACEOF + +fi +done + +fi + if test "$with_pam" = yes ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5 $as_echo_n "checking for pam_start in -lpam... " >&6; } @@ -10961,6 +11087,17 @@ else fi +fi + +if test "$with_gnutls" = yes ; then + ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default" +if test
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Here is a rebased version of the patch. Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index a0ca2851e5..f8c59ea127 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -926,6 +926,7 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, + REINDEX CONCURRENTLY, CREATE STATISTICS and ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 2e053c4c24..4019bad4c2 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } An index build with the CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be - convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + convenient to use REINDEX to rebuild them. @@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. + There are several caveats to be aware of when using this option + see . 
+ + + + VERBOSE @@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + + + +Rebuilding an index can interfere with regular operation of a database. +Normally PostgreSQL locks the table whose index is rebuilt +against writes and performs the entire index build with a single scan of the +table. Other transactions can still read the table, but if they try to +insert, update, or delete rows in the table they will block until the +index rebuild is finished. This could have a severe effect if the system is +a live production database. Very large tables can take many hours to be +indexed, and even for smaller tables, an index rebuild can lock out writers +for periods that are unacceptably long for a production system. + + + +PostgreSQL supports rebuilding indexes with minimum locking +of writes. This method is invoked by specifying the +CONCURRENTLY option of REINDEX. When this option +is used, PostgreSQL must perform two scans of the table +for each index that needs to be rebuild and in addition it must wait for +all existing transactions that could potentially use the index to +terminate. This method requires more total work than a standard index +rebuild and takes significantly longer to complete as it needs to wait +for unfinished transactions that might modify the index. However, since +it allows normal operations to continue while the index is rebuilt, this +method is useful for rebuilding indexes in a production environment. Of +course, the extra CPU, memory and I/O load imposed by the index rebuild +may slow down other operations. + + + +The following steps occur in a concurrent index build, each in a separate +transaction except when the new index definitions are created, where all +the concurrent entries are created using only one transaction. 
Note that +if there are multiple indexes to be rebuilt then each step loops through +all the indexes we're rebuilding, using a separate transaction for each one. +REINDEX CONCURRENTLY proceeds as follows when rebuilding +indexes: + + + + + A new temporary index definition is added into the catalog + pg_index. This definition will be used to replace the + old index. This step is done as a single transaction for all the indexes + involved in this process, meaning that if + REINDEX CONCURRENTLY is run on a table with multiple + indexes, all the catalog entries of the new indexes are created within a + single transaction. A SHARE UPDATE EXCLUSIVE lock at + session level is taken on the indexes being reindexed as well as its + parent table to prevent any schema modification while processing. + + + + + A first
Re: [HACKERS] git down
On 10/27/2017 10:51 PM, Erik Rijkers wrote:
> git.postgresql.org is down/unreachable
>
> ( git://git.postgresql.org/git/postgresql.git )

Yes, I noticed this too, but https://git.postgresql.org/git/postgresql.git still works fine. I guess it makes sense to remove unencrypted access, but in that case https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary should not advertise supporting the git protocol. I have not seen any announcement either, but that could just be me not paying enough attention.

Andreas
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Sorry for the very late review. I like this feature and have needed it myself in the past, and the current syntax seems pretty good. One could ask whether the syntax could be generalized to also support other types like json and hstore, but I do not think it would be fair to block this patch because of that.

== Limitations of the current design

1) Array element foreign keys can only be specified at the table level (not on columns): I think this limitation is fine. Other PostgreSQL-specific features like exclusion constraints can also only be specified at the table level.

2) Lack of support for SET NULL and SET DEFAULT: these do not seem very useful for arrays.

3) Lack of support for specifying multiple arrays in the foreign key: seems like a good thing to me since it is not obvious what such a thing even would do.

4) That you need to add a cast to the index if you have different types: due to there not being an int4[] <@ int2[] operator you need to add an index on (col::int4[]) to speed up deletes and updates. This one is annoying since EXPLAIN won't give you the query plans for the foreign key queries, but I do not think fixing this should be within the scope of the patch, and having a smaller integer type in the referring table should be rare.

5) The use of count(DISTINCT) requiring types to support btree equality: this has been discussed a lot up-thread and I think the current state is good enough.

== Functional review

I have played around some with it and things seem to work and the test suite passes, but I noticed a couple of strange behaviors.

1) MATCH FULL does not seem to care about NULLs in arrays. In the example below I expected both inserts into the referring table to fail.
CREATE TABLE t (x int, y int, PRIMARY KEY (x, y));
CREATE TABLE fk (x int, ys int[], FOREIGN KEY (x, EACH ELEMENT OF ys) REFERENCES t MATCH FULL);
INSERT INTO t VALUES (10, 1);
INSERT INTO fk VALUES (10, '{1,NULL}');
INSERT INTO fk VALUES (NULL, '{1}');

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
ERROR:  insert or update on table "fk" violates foreign key constraint "fk_x_fkey"
DETAIL:  MATCH FULL does not allow mixing of null and nonnull key values.

2) To me it was not obvious that ON DELETE CASCADE would delete the whole rows rather than delete the members from the array, and this kind of misunderstanding can lead to pretty bad surprises in production. I am leaning towards not supporting CASCADE.

== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray" operator to avoid having to build arrays, but that part was reverted due to a bug. I am no expert on the GIN code, but as far as I can tell it would be relatively simple to fix that bug. Just allocate an array of Datums of length one where you put the element you are searching for (or maybe a copy of it).

Potential issues with adding the operators:

1) Do we really want to add an operator just for array element foreign keys? I think this is not an issue since it seems like it should be useful in general. I know I have wanted it myself.

2) I am not sure, but the committers might prefer it if adding the operators were done in a separate patch.

3) Bikeshedding about operator names. I personally think @>> is clear enough and as far as I know it is not used for anything else.

== Code review

The patch no longer applies to HEAD, but the conflicts are small.

I think we should be more consistent in the naming, both in the code and in the documentation. Right now we have "array foreign keys", "element foreign keys", "ELEMENT foreign keys", etc.

+	/*
+	 * If this is an array foreign key, we must look up the operators for
+	 * the array element type, not the array type itself.
+	 */
+	if (fkreftypes[i] != FKCONSTR_REF_PLAIN)

+	if (fkreftypes[i] != FKCONSTR_REF_PLAIN)
+	{
+		old_fktype = get_base_element_type(old_fktype);
+		/* this shouldn't happen ... */
+		if (!OidIsValid(old_fktype))
+			elog(ERROR, "old foreign key column is not an array");
+	}

+	if (riinfo->fk_reftypes[i] != FKCONSTR_REF_PLAIN)
+	{
+		riinfo->has_array = true;
+		riinfo->ff_eq_oprs[i] = ARRAY_EQ_OP;
+	}

In the three diffs above it would be much cleaner to check for "== FKCONSTR_REF_EACH_ELEMENT" since that better conveys the intent and is safer when adding new reference types in the future.

+	/* We look through any domain here */
+	fktype = get_base_element_type(fktype);

What does the comment above mean?

 		if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))
 			ereport(ERROR,
 					(errcode(ERRCODE_DATATYPE_MISMATCH),
-					 errmsg("foreign key constraint \"%s\" "
-							"cannot be implemented",
-							fkconstraint->conname),
-					 errdetail("Key columns \"%s\" and
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/21/2017 10:55 PM, Peter Geoghegan wrote:
> I agree, but the bigger issue is that we're *half way* between supporting
> only one format, and supporting two formats. AFAICT, there is no reason
> that we can't simply support one format on all ICU versions, and keep what
> ends up within pg_collation at initdb time essentially the same across ICU
> versions (except for those that are due to cultural/political
> developments).

I think we are in agreement then, but I do not have the time to get this done before the release of 10, so I would be happy if you implemented it.

Peter E, what do you say to this?

Andreas
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/21/2017 01:40 AM, Peter Geoghegan wrote:
> On Wed, Sep 20, 2017 at 4:08 PM, Peter Geoghegan wrote:
>> pg_import_system_collations() takes care to use the non-BCP-47 style for
>> such versions, so I think this is working correctly.
>
> But CREATE COLLATION doesn't use pg_import_system_collations(). And
> perhaps more to the point: it is highly confusing that we use one or the
> other of those 2 things ("langtag"/BCP 47 tag or "name"/legacy locale
> name) as "colcollate", depending on ICU version, thereby *behaving* as if
> ICU < 54 really didn't know anything about BCP 47 tags. Because, obviously
> earlier ICU versions know plenty about BCP 47, since 9 lines further down
> we use "langtag"/BCP 47 tag as collname for CollationCreate() --
> regardless of ICU version.
>
> How can you say "ICU < 54 doesn't even support the BCP 47 style", given
> all that? Those versions will still have locales named "*-x-icu" when
> users do "\dOS". Users will be highly confused when they quite reasonably
> try to generalize from the example in the docs and what "\dOS" shows, and
> get results that are wrong, often only in a very subtle way.

If we are fine with supporting only ICU 4.2 and later (which I think we are, given that ICU 4.2 was released in 2009) then using uloc_forLanguageTag()[1] to validate and canonicalize seems like the right solution. I had missed that this function even existed when I last read the documentation. Does it return a BCP 47 tag in modern versions of ICU? I strongly prefer it if there is, as much as possible, only one format for inputting ICU locales.

1. http://www.icu-project.org/apiref/icu4c/uloc_8h.html#aa45d6457f72867880f079e27a63c6fcb

Andreas
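To illustrate the two input formats under discussion, here is a sketch of what CREATE COLLATION currently accepts (the collation names are made up; both spellings describe the same locale, which is exactly the ambiguity that validating and canonicalizing through uloc_forLanguageTag() would remove):

```sql
-- BCP 47 language tag
CREATE COLLATION german_bcp47 (provider = icu, locale = 'de-DE');

-- Legacy ICU locale name for the same locale
CREATE COLLATION german_legacy (provider = icu, locale = 'de_DE');
```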
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/19/2017 11:32 PM, Peter Geoghegan wrote:
> On Tue, Sep 19, 2017 at 2:22 PM, Tom Lane wrote:
>> Well, if PG10 shipped with that restriction in place then it wouldn't be
>> an issue ;-)
>
> I was proposing that this be treated as an open item for v10; sorry if I
> was unclear on that. Much like the "ICU locales vs. ICU collations within
> pg_collation" issue, this seems like the kind of thing that we ought to go
> out of our way to get right in the *first* version.

If people think it is possible to get this done in time for PostgreSQL 10, and it does not break anything on older versions of ICU (or the migration from older versions), I am all for it.

Andreas
Re: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
On 09/19/2017 12:46 AM, Peter Geoghegan wrote:
> At one point a couple of months back, it was understood that
> get_icu_language_tag() might not always work with (assumed) valid locale
> names -- that is at least the impression that the commit message of
> eccead9 left me with. But, that was only with ICU 4.2, and in any case
> we've since stopped creating keyword variants at initdb time for other
> reasons (see 2bfd1b1 for details of those other reasons).
>
> I tend to think that we should not install any language tag that
> uloc_toLanguageTag() does not accept as valid on general principle (so
> not just at initdb time, when it's actually least needed). Thoughts? I
> can write a patch for this, if that helps. It should be straightforward.

Hm, I like the idea but I see some issues. Enforcing BCP 47 seems like a good thing to me; I do not see any reason to allow input with syntax errors. The issue though is that we do not want to break people's databases when they upgrade to PostgreSQL 11. What if they have specified the locale in the old non-ICU format, or they have a bogus value, and we then error out on pg_upgrade or pg_restore?

Andreas
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
I have not looked at the issue with the btree_gin tests yet, but here is the first part of my review.

= Review

This is my first quick review where I just read the documentation and quickly tested the feature. I will review it more in depth later.

This is a very useful feature, one which I have long wished for.

The patch applies, compiles and passes the test suite with just one warning:

parse_coerce.c: In function ‘select_common_type_2args’:
parse_coerce.c:1379:7: warning: statement with no effect [-Wunused-value]
   rightOid;
   ^~~~

= Functional

The documentation does not agree with the code on the syntax. The documentation claims it is "FOREIGN KEY (ELEMENT xs) REFERENCES t1 (x)" when it actually is "FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x)". Likewise I can't get the "final_positions integer[] ELEMENT REFERENCES drivers" syntax to work, but here I cannot see any change in the syntax to support it.

Related to the above: I am not sure it is a good idea to make ELEMENT a reserved word in column definitions. What if the SQL standard wants to use it for something?

The documentation claims ON CASCADE DELETE is not supported by array element foreign keys, but I do not think that is actually the case.

I think I prefer (EACH ELEMENT OF xs) over (ELEMENT xs) given how the former is more in what I feel is the spirit of SQL. And if so we should match it with "xs integer[] EACH ELEMENT REFERENCES t1 (x)", assuming we want that syntax.

Once I have created an array element foreign key the basic features seem to work as expected. The error message below fails to mention that it is an array element foreign key, but I do not think that is a blocker for getting this feature merged. Right now I cannot think of how to improve it either.

$ INSERT INTO t3 VALUES ('{1,3}');
ERROR:  insert or update on table "t3" violates foreign key constraint "t3_xs_fkey"
DETAIL:  Key (xs)=({1,3}) is not present in table "t1".
= Nitpicking/style comments

In doc/src/sgml/catalogs.sgml the "conpfeqop" line is incorrectly indented.

I am not a fan of calling it "array-vs-scalar". What about "array to scalar"?

In ddl.sgml "date" should be lower case like the other types in "race_day DATE,".

In ddl.sgml I suggest removing the "..." from the examples to make it possible to copy-paste them easily.

Your text wrapping in ddl.sgml and create_table.sgml is quite arbitrary. I suggest wrapping all paragraphs at 80 characters (except for code, which should not be wrapped). Your text editor probably has tools for wrapping paragraphs.

Please be consistent about how you write table names and SQL in general. Almost all places use lower case for table names, while your examples in create_table.sgml use FKTABLEFORARRAY.

Andreas
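For context, here is a sketch of the schema that the error-message example in the functional section above implies. The definitions of t1 and t3 are my reconstruction (and use the patch's proposed EACH ELEMENT OF syntax, which does not exist in stock PostgreSQL); only the failing INSERT and the error text come from the review itself:

```sql
CREATE TABLE t1 (x int PRIMARY KEY);
CREATE TABLE t3 (xs int[], FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x));

INSERT INTO t1 VALUES (1);
INSERT INTO t3 VALUES ('{1,3}');
-- ERROR:  insert or update on table "t3" violates foreign key constraint "t3_xs_fkey"
-- DETAIL:  Key (xs)=({1,3}) is not present in table "t1".
```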
Re: [HACKERS] GnuTLS support
On 09/15/2017 06:55 PM, Jeff Janes wrote: I can't build against gnutls-2.12.23-21.el6.x86_64 from CentOS 6.9 Thanks for testing my patch. I have fixed both these issues plus some of the other feedback. A new version of my patch is attached which should, at least on theory, support all GnuTLS versions >= 2.11. I just very quickly fixed the broken SSL tests, as I am no fan of how the SSL tests currently are written and think they should be cleaned up. Andreas diff --git a/configure b/configure index 0d76e5ea42..33b1f00bff 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -838,6 +839,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1534,6 +1536,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -6051,6 +6054,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? 
"no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10218,6 +10256,83 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char gnutls_init (); +int +main () +{ +return gnutls_init (); + ; + return 0; +} +_ACEOF +for ac_lib in '' gnutls; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_gnutls_init=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_gnutls_init+:} false; then : + break +fi +done +if ${ac_cv_search_gnutls_init+:} false; then : + +else + ac_cv_search_gnutls_init=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5 +$as_echo "$ac_cv_search_gnutls_init" >&6; } +ac_res=$ac_cv_search_gnutls_init +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +else + as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5 +fi + + # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted + # certificate chains. 
+ ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include +#include + +" +if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl +_ACEOF + +fi + if test "$with_pam" = yes ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5 $as_echo_n "checking for pam_start in -lpam... " >&6; } @@ -11015,6 +11130,17 @@ else fi +fi + +if test "$with_gnutls" = yes ; then + ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default" +if test "x$ac_cv_header_gnutls_gnutls_h" = xyes; then : + +else + as_fn_error $? "header file is required for GnuTLS" "$LINENO" 5 +fi + + fi if test "$with_pam" = yes ; then @@ -15522,9 +15648,11 @@ fi # in the template or configure command line. # If not selected manually, try to select a source automatically. -if test "$enable_strong_random" = "yes" && test x"$USE_OPENSSL_RANDOM" = x"" && test x"$USE_WIN32_RANDOM" = x"" && test x"$USE_DEV_URANDOM" = x"" ; then +if test "$enable_strong_random" = "yes" && test
Re: [HACKERS] postgres_fdw super user checks
On 09/14/2017 08:33 PM, Jeff Janes wrote:
> Attached is a new patch which fixes the style issue you mentioned.

Thanks, the patch looks good now, and as far as I can tell there was no need to update the comments or the documentation, so I am setting this as ready for committer.

Andreas
Re: [HACKERS] generated columns
On 09/13/2017 04:04 AM, Simon Riggs wrote:
> On 31 August 2017 at 05:16, Peter Eisentraut wrote:
>> - index support (and related constraint support)
>
> Presumably you can't index a VIRTUAL column. Or at least I don't think its
> worth spending time trying to make it work.

I think end users would be surprised if one can index STORED columns and expressions but not VIRTUAL columns. So unless it is a huge project I would say it is worth it.

Andreas
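A sketch of the distinction being discussed, using the GENERATED ALWAYS AS syntax from the proposed patch (table and column names are illustrative). Note that an expression index can already emulate indexing a VIRTUAL column today, which is part of why users might expect indexing to work:

```sql
CREATE TABLE orders (
    qty   int,
    price numeric,
    total numeric GENERATED ALWAYS AS (qty * price) STORED
);

-- A STORED generated column can be indexed like any other column:
CREATE INDEX ON orders (total);

-- For a VIRTUAL column, the equivalent would presumably behave like
-- today's expression indexes:
CREATE INDEX ON orders ((qty * price));
```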
Re: [HACKERS] Patches that don't apply or don't compile: 2017-09-12
On 09/12/2017 04:14 PM, Aleksander Alekseev wrote:
> Title: Foreign Key Arrays
> Author: Mark Rofail
> URL: https://commitfest.postgresql.org/14/1252/

I am currently reviewing this one and it applies, compiles, and passes the test suite. It could be the compilation warnings that make the system think it failed, but I could not find the log of the failed build.

We want to be welcoming to new contributors, so until we have a reliable CI server which can provide easy-to-read build logs I am against changing the status of any patches solely based on the result of the CI server. I think it should be used as a complementary tool until the community deems it to be good enough.

Andreas
Re: [HACKERS] postgres_fdw super user checks
On 07/27/2017 09:45 PM, Jeff Janes wrote: > Here is an updated patch. This version allows you to use the password-less connection if you either are the super-user directly (which is the existing committed behavior), or if you are using the super-user's mapping because you are querying a super-user-owned view which you have been granted access to. I have tested the patch and it passes the tests and works, and the code looks good (I have a small nitpick below). The feature seems useful, especially for people who already use views for security, so the question is if this is a potential footgun. I am leaning towards no since the superuser should be careful when granting access to his views anyway. It would have been nice if there was a more generic way to handle this since 1) the security issue is not unique to postgres_fdw and 2) this requires you to create a view. But since the patch is simple, an improvement in itself, and does not prevent any further improvements in this area I see no reason to let perfect be the enemy of good. = Nitpicking/style I would prefer if /* no check required if superuser */ if (superuser()) return; if (superuser_arg(user->userid)) return; was, for consistency with the if clause in connect_pg_server(), written as /* no check required if superuser */ if (superuser() || superuser_arg(user->userid)) return; Andreas
Re: [HACKERS] GnuTLS support
On 09/07/2017 11:34 PM, Tomas Vondra wrote: I am worried about having 3x version of TLS controls in postgresql.conf, and only one set being active. Perhaps we need to break out the TLS config to separate files or something. Anyway, this needs more thought. Well, people won't be able to set the inactive options, just like you can't set ssl=on when you build without OpenSSL support. But perhaps we could simply not include the inactive options into the config file, no? Yeah, I have been thinking about how bad it would be to dynamically generate the config file. I think I will try this. Daniel: What options does Secure Transport need for configuring ciphers, ECDH, and cipher preference? Does it need any extra options (I think I saw something about the keychain)? Andreas
[HACKERS] GnuTLS support
Hi, I have seen discussions from time to time about OpenSSL and its licensing issues so I decided to see how much work it would be to add support for another TLS library, and I went with GnuTLS since it is the library I know best after OpenSSL and it is also a reasonably popular library. Attached is a work in progress patch which implements the basics. I am sending it to the list because I want feedback on some design questions and to check with the community if this is a feature we want. = What is implemented - Backend - Frontend - Diffie-Hellman support - Using GnuTLS for secure random numbers - Using GnuTLS for sha2 = Work left to do - Add GnuTLS support to sslinfo - Add GnuTLS support to pgcrypto - Support for GnuTLS's version of engines - Test code with older versions of GnuTLS - Update documentation - Add support for all postgresql.conf options (see design question) - Fix two failing tests (see design question) = Design questions == GnuTLS priority strings vs OpenSSL cipher lists GnuTLS uses a different format for specifying ciphers. Instead of setting ciphers, protocol versions, and ECDH curves separately, GnuTLS uses a single priority string[1]. For example the default settings of PostgreSQL (which include disabling SSLv3) ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers ssl_prefer_server_ciphers = on ssl_ecdh_curve = 'prime256v1' are represented with a string similar to SECURE128:+3DES-CBC:+GROUP-SECP256R1:%SERVER_PRECEDENCE So the two libraries have decided on different ways to specify things. One way to solve the issue would be to just let ssl_ciphers be the priority string and then add %SERVER_PRECEDENCE to it if ssl_prefer_server_ciphers is on. Adding the ssl_ecdh_curve is trickier since the curves have different names in GnuTLS (e.g. prime256v1 vs SECP256R1) and I would rather avoid having to add such a mapping to PostgreSQL. Thoughts? 
== Potentially OpenSSL-specific test cases There are currently two failing SSL tests which at least to me seem more like they test specific OpenSSL behaviors rather than something which needs to be true for all SSL libraries. The two tests: # Try with just the server CA's cert. This fails because the root file # must contain the whole chain up to the root CA. note "connect with server CA cert, without root CA"; test_connect_fails("sslrootcert=ssl/server_ca.crt sslmode=verify-ca"); # A CRL belonging to a different CA is not accepted, fails test_connect_fails( "sslrootcert=ssl/root+server_ca.crt sslmode=verify-ca sslcrl=ssl/client.crl"); For the missing root CA case GnuTLS seems to be happy enough with just an intermediate CA and as far as I can tell this behavior is not easy to configure. And for the CRL belonging to a different CA case GnuTLS can be configured to either just store such a non-validating CRL or to ignore it, but not to return an error. Personally I think these two tests should just be removed but maybe I am missing something. Notes: 1. 
https://gnutls.org/manual/html_node/Priority-Strings.html Andreas diff --git a/configure b/configure index a2f9a256b4..8dcb26b532 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -838,6 +839,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1534,6 +1536,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -6051,6 +6054,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? "no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10218,6 +10256,67 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h -
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
I have attached a new, rebased version of the patch with most of Banck's and some of your feedback incorporated. Thanks for the good feedback! On 03/31/2017 08:27 AM, Michael Paquier wrote: When running REINDEX SCHEMA CONCURRENTLY public on the regression database I am bumping into a bunch of these warnings: WARNING: 01000: snapshot 0x7fa5e640 still active LOCATION: AtEOXact_Snapshot, snapmgr.c:1123 WARNING: 01000: snapshot 0x7fa5e640 still active LOCATION: AtEOXact_Snapshot, snapmgr.c:1123 I failed to reproduce this. Do you have a reproducible test case? + * Reset attcacheoff for a TupleDesc + */ +void +ResetTupleDescCache(TupleDesc tupdesc) +{ + int i; + + for (i = 0; i < tupdesc->natts; i++) + tupdesc->attrs[i]->attcacheoff = -1; +} I think that it would be better to merge that with TupleDescInitEntry to be sure that the initialization of a TupleDesc's attribute goes through only one code path. Sorry, but I am not sure I understand your suggestion. I do not like the ResetTupleDescCache function so all suggestions are welcome. -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name I am taking the war path with such a sentence... But what about adding CONCURRENTLY to the list of options in parenthesis instead? I have thought some about this myself and I do not care strongly either way. - Explore the use of SQL-level interfaces to mark an index as inactive at creation. - Remove work done in changeDependencyForAll, and replace it by something similar to what tablecmds.c does. There is I think here some place for refactoring if that's not with CREATE TABLE LIKE. This requires to the same work of creation, renaming and drop of the old triggers and constraints. I am no fan of the current code duplication and how fragile it is, but I think these cases are sufficiently different to prevent meaningful code reuse. 
But it could just be me who is unfamiliar with that part of the code. - Do a per-index rebuild and not a per-relation rebuild for concurrent indexing. Doing a per-relation reindex has the disadvantage that many objects need to be created at the same time, and in the case of REINDEX CONCURRENTLY time of the operation is not what matters, it is how intrusive the operation is. Relations with many indexes would also result in much object locks taken at each step. I am still leaning towards my current tradeoff since waiting for all queries to stop using an index can take a lot of time and if you only have to do that once per table it would be a huge benefit under some workloads, and you can still reindex each index separately if you need to. Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index dda0170886..c97944b2c9 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -926,7 +926,7 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), ANALYZE, CREATE INDEX CONCURRENTLY, - CREATE STATISTICS and + REINDEX CONCURRENTLY, CREATE STATISTICS and ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ). diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 3908ade37b..4ef3a89a29 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -67,10 +67,7 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } An index build with the CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be - convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. 
To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + convenient to use REINDEX to rebuild them. @@ -151,6 +148,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. + There are several caveats to be aware of when using this option + see . + + + + VERBOSE @@ -231,6 +243,173 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + +
Re: [HACKERS] PostgreSQL not setting OpenSSL session id context?
On 08/04/2017 08:48 PM, Shay Rojansky wrote: On 2017-08-04 07:22:42 +0300, Shay Rojansky wrote: > I'm still not convinced of the risk/problem of simply setting the session > id context as I explained above (rather than disabling the optimization), > but of course either solution resolves my problem. How would that do anything? Each backend has it's own local memory. I.e. any cache state that openssl would maintain wouldn't be useful. If you want to take advantage of features around this you really need to cache tickets in shared memory... Guys, there's no data being cached at the backend - RFC5077 is about packaging information into a client-side opaque session ticket that allows skipping a roundtrip on the next connection. As I said, simply setting the session id context (*not* the session id or anything else) makes this feature work, even though a completely new backend process is launched. Yes, session tickets are encrypted data which is stored by the client. But if we are going to support them I think we should do it properly with new GUCs for the key file and disabling the feature. Using a key file is less necessary for PostgreSQL than for a web server since it is less common to do round robin load balancing between different PostgreSQL instances. Andreas
Re: [HACKERS] oidin / oidout and InvalidOid
On 06/12/2017 01:41 PM, Chapman Flack wrote: I was recently guilty of writing a less-than-clear SQL example because I plain forgot whether InvalidOid was 0 (correct) or -1 (my bad). Would there be any sense in letting oidin_subr accept the string InvalidOid for 0? I understand that changing oidout could break existing code outside of the tree. But what if oidout were to be conservative in what it does, and oidin liberal in what it accepts? I am not sure I am a fan of this, but if we should have an alias for InvalidOid how about reusing '-' which is used by the reg*in functions? Or is that too non-obvious? Andreas
Re: [HACKERS] CTE inlining
On 05/04/2017 06:22 PM, Andrew Dunstan wrote: I wrote this query: select (json_populate_record(null::mytype, myjson)).* from mytable; It turned out that this was an order of magnitude faster: with r as ( select json_populate_record(null::mytype, myjson) as x from mytable ) select (x).* from r; I do not know the planner that well, but I imagined that once we remove the optimization fence this would be evaluated as if it had been written as a lateral join, i.e. there would be no extra function calls in this case after removing the fence. Andreas
Re: [HACKERS] CTE inlining
On 05/03/2017 07:33 PM, Alvaro Herrera wrote: 1) we switch unmarked CTEs as inlineable by default in pg11. What seems likely to happen for a user that upgrades to pg11 is that 5 out of 10 CTE-using queries are going to become faster than with pg10, and they are going to be happy; 4 out of five are going to see no difference, but they didn't have to do anything about it; and the remaining query is going to become slower, either indistinguishably so (in which case they don't care and they remain happy because of the other improvements) or notably so, in which case they can easily figure where to add the MATERIALIZED option and regain the original performance. 2) unmarked CTEs continue to be an optimization barrier, but we add "WITH INLINED" so that they're inlineable. Some users may wonder about it and waste a lot of time trying to figure out which CTEs to add it to. They see a benefit in half the queries, which makes them happy, but they are angry that they had to waste all that time on the other queries. 3) We don't do anything, because we all agree that GUCs are not suitable. No progress. No anger, but nobody is happy either. +1 for option 1. And while I would not like it if we had to combine it with a backwards-compatibility GUC which enables the old behavior to get it merged, I would still personally prefer that over options 2 and 3. Andreas
Re: [HACKERS] CTE inlining
On 05/02/2017 04:38 AM, Craig Ringer wrote: On 1 May 2017 at 22:26, Andreas Karlsson <andr...@proxel.se> wrote: I am not sure I like decorators since this means adding an ad hoc query hint directly into the SQL syntax which is something that requires serious consideration. And mangling the semantics of existing syntax doesn't? That's what we do right now so we can pretend we don't have query hints while still having query hints. I am in favor of removing the optimization fence from CTEs, and strongly prefer no fence being the default behavior since SQL is a declarative language and I think it is reasonable to assume that CTEs can be inlined. But the question is how to best remove the fence while taking into account that quite many people use them as optimization fences today. I see some alternatives, none of them perfect. 1. Just remove the optimization fence and let people add OFFSET 0 to their queries if they want an optimization fence. This lets us keep pretending that we do not have query hints (and therefore do not have to formalize any syntax for them) while still allowing people to add optimization fences. 2. Add a decorator for WITH (e.g. "WITH MATERIALIZED x (...") to add an explicit optimization fence. This will for the first time add official support for a query hint in the syntax which is a quite big precedent. 3. Add a new GUC which can enable and disable the optimization fence. This is a very clumsy tool, but maybe good enough for some users and some people here in this thread have complained about our similar GUCs. 4. Add some new more generic query hinting facility. This is a lot of work and something which would be very hard to get consensus for. Andreas
Re: [HACKERS] CTE inlining
On 05/01/2017 04:33 PM, David G. Johnston wrote: > On Mon, May 1, 2017 at 7:26 AM, Andreas Karlsson <andr...@proxel.se > I am not sure I like decorators since this means adding an ad hoc > query hint directly into the SQL syntax which is something that > requires serious consideration. > > Given that we already have > " > prevent optimization > " > syntax why do we need a decorator on the CTE? I do not think I follow. Me and some other people here would ideally allow CTEs to be inlined by default. Some people today use CTEs as optimization fences, to for example control join order, and the suggestion here is to add new syntax for CTEs to allow them to selectively be used as optimization fences. > I would shorten that to "WITH MAT" except that I don't think that > having two ways to introduce an optimization fence is worthwhile. You mean OFFSET 0? I have never been a fan of using it as an optimization fence. I do not think OFFSET 0 conveys clearly enough to the reader that it is an optimization fence. Andreas
Re: [HACKERS] CTE inlining
On 05/01/2017 04:17 PM, David Fetter wrote: Maybe we could allow a "decorator" that would tell the planner the CTE could be inlined? WITH INLINE mycte AS ( ...) +1 for a decorator, -1 for this one. I am not sure I like decorators since this means adding an ad hoc query hint directly into the SQL syntax which is something that requires serious consideration. We already have an explicit optimization fence with OFFSET 0, and I think making optimization fences explicit is how we should continue. I'd be more in favor of something along the lines of WITH FENCED /* Somewhat fuzzy. What fence? */ or WITH AT_MOST_ONCE /* Clearer, but not super precise */ or WITH UNIQUE_ATOMIC /* More descriptive, but not super clear without the docs in hand */ or something along that line. What about WITH MATERIALIZED, borrowing from the MySQL terminology "materialized subquery"? Andreas
Re: [HACKERS] PG 10 release notes
On 04/25/2017 03:31 AM, Bruce Momjian wrote: I have committed the first draft of the Postgres 10 release notes. They are current as of two days ago, and I will keep them current. Please give me any feedback you have. This item is incorrectly attributed to me. I was only the reviewer, Peter is the author. + + + +Create a linkend="catalog-pg-sequence">pg_sequence system catalog to store sequence metadata (Andreas +Karlsson) + Andreas
Re: [HACKERS] Self-signed certificate instructions
On 04/15/2017 03:58 PM, Andrew Dunstan wrote: The instructions on how to create a self-signed certificate in section 18.9.3 of the docs seem unduly cumbersome. +1, I see no reason for us to spread unnecessarily complicated instructions. Andreas
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 04/16/2017 03:14 AM, Tom Lane wrote: 1. Back-patch that patch, probably also including the followup adjustments in 86029b31e and 36a3be654. 2. Add #if's to use 31cf1a1a4's coding with OpenSSL >= 1.1, while keeping the older code for use when built against older OpenSSLs. 3. Conditionally disable renegotiation altogether with OpenSSL >= 1.1, thus adopting 9.5 not 9.4 behavior when using newer OpenSSL. [...] Thoughts? Given that I cannot recall seeing any complaints about the behavior of 9.4 compared to 9.3 I am leaning towards #1. That way there are fewer different versions of our OpenSSL code. Andreas
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/14/2017 11:54 PM, Tom Lane wrote: I failed to resist the temptation to poke at this, and found that indeed nothing seems to break if we just use one transaction for the whole processing of postgres.bki. So I've pushed a patch that does that. We're definitely down to the point where worrying about the speed of bootstrap mode, per se, is useless; the other steps in initdb visibly take a lot more time. I looked at this some, and what takes time now for me seems mainly to be these four things (out of a total runtime of 560 ms): 1. setup_conversion: 140 ms 2. select_default_timezone: 90 ms 3. bootstrap_template1: 80 ms 4. setup_schema: 65 ms These four take up about two thirds of the total runtime, so it seems likely that we may still have relatively low-hanging fruit (but not worth committing for PostgreSQL 10). I have not done profiling of these functions yet, so I am not sure how they would best be fixed, but maybe setup_conversion could be converted into bki entries to speed it up. Andreas
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/13/2017 06:13 PM, Tom Lane wrote: I've pushed this with some mostly-cosmetic adjustments: Thanks! I like your adjustments. There's certainly lots more that could be done in the genbki code, but I think all we can justify at this stage of the development cycle is to get the low-hanging fruit for testing speedups. Yeah, I also noticed that the genbki code seems to have gotten little love and that much more can be done here. Andreas
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/12/2017 05:00 PM, Andreas Karlsson wrote: Looked at this and option 1 seems simple enough if I am not missing something. I might hack something up later tonight. Either way I think this improvement can be done separately from the proposed replacement of the catalog header files. Trying to fix everything at once often leads to nothing being fixed at all. Here is my proof of concept patch. It does basically the same thing as Andres's patch except that it handles quoted values a bit better and does not try to support anything other than the regproc type. The patch speeds up initdb without fsync from 0.80 seconds to 0.55 seconds, which is a nice speedup, while adding a negligible amount of extra work on compilation. Two things which could be improved in this patch if people deem it important: - Refactor the code to be more generic, like Andres's patch, so it is easier to add lookups for other data types. - Write something closer to a real .bki parser to actually understand quoted values and _null_ when parsing the data. Right now this patch only splits each row into its fields (while being aware of quotes) but does not attempt to remove quotes. The PoC patch treats "foo" and foo as different. 
Andreas commit bc8ede661b12407c0b6a9e4c0932d21028f87fc1 Author: Andreas Karlsson <andr...@proxel.se> Date: Wed Apr 12 21:00:49 2017 +0200 WIP: Resolve regproc entires when generating .bki diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index 6e9d57aa8d..f918c9ef8a 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -102,6 +102,7 @@ print $bki "# PostgreSQL $major_version\n"; # vars to hold data needed for schemapg.h my %schemapg_entries; my @tables_needing_macros; +my %procs; our @types; # produce output, one catalog at a time @@ -164,20 +165,41 @@ foreach my $catname (@{ $catalogs->{names} }) $row->{bki_values} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g; $row->{bki_values} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g; + # Split values into tokens without interpreting their meaning. + my %bki_values; + @bki_values{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g; + + # Substitute regproc entires with oids + foreach my $att (keys %bki_values) + { +next if $bki_attr{$att}->{type} ne 'regproc'; +next if $bki_values{$att} =~ /\A(\d+|-|_null_)\z/; + +$bki_values{$att} = $procs{$bki_values{$att}}; + } + + # Save pg_proc oids for use by later catalogs. This relies on the + # order we process the files, but the bootstrap code also relies on + # pg_proc being loaded first. + if ($catname eq 'pg_proc') + { +$procs{$bki_values{proname}} = $row->{oid}; + } + # Save pg_type info for pg_attribute processing below if ($catname eq 'pg_type') { -my %type; +my %type = %bki_values; $type{oid} = $row->{oid}; -@type{@attnames} = split /\s+/, $row->{bki_values}; push @types, \%type; } # Write to postgres.bki my $oid = $row->{oid} ? 
"OID = $row->{oid} " : ''; - printf $bki "insert %s( %s)\n", $oid, $row->{bki_values}; + printf $bki "insert %s( %s)\n", $oid, + join(' ', @bki_values{@attnames}); - # Write comments to postgres.description and postgres.shdescription + # Write comments to postgres.description and postgres.shdescription if (defined $row->{descr}) { printf $descr "%s\t%s\t0\t%s\n", $row->{oid}, $catname, diff --git a/src/backend/utils/Gen_fmgrtab.pl b/src/backend/utils/Gen_fmgrtab.pl index 2af9b355e7..10d9c6abc7 100644 --- a/src/backend/utils/Gen_fmgrtab.pl +++ b/src/backend/utils/Gen_fmgrtab.pl @@ -58,16 +58,9 @@ foreach my $column (@{ $catalogs->{pg_proc}->{columns} }) my $data = $catalogs->{pg_proc}->{data}; foreach my $row (@$data) { - - # To construct fmgroids.h and fmgrtab.c, we need to inspect some - # of the individual data fields. Just splitting on whitespace - # won't work, because some quoted fields might contain internal - # whitespace. We handle this by folding them all to a simple - # "xxx". Fortunately, this script doesn't need to look at any - # fields that might need quoting, so this simple hack is - # sufficient. - $row->{bki_values} =~ s/"[^"]*"/"xxx"/g; - @{$row}{@attnames} = split /\s+/, $row->{bki_values}; + # Split values into tokens without interpreting their meaning. + # This is safe becease we are going to use the values as-is. + @{$row}{@attnames} = $row->{bki_values} =~ /"[^"]*"|\S+/g; # Select out just the rows for internal-language procedures. # Note assumption here that INTERNALlanguageId is 12. diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c index 702924a958..a4237b0661 100644 --- a/src/backend/utils/adt/regproc.c +++ b/src/backend/utils/adt/regproc.c @@
Re: [HACKERS] Cutting initdb's runtime (Perl question embedded)
On 04/12/2017 04:12 PM, Tom Lane wrote: 1. The best thing would still be to make genbki.pl do the conversion, and write numeric OIDs into postgres.bki. The core stumbling block here seems to be that for most catalogs, Catalog.pm and genbki.pl never really break down a DATA line into fields --- and we certainly have got to do that, if we're going to replace the values of regproc fields. The places that do need to do that approximate it like this: # To construct fmgroids.h and fmgrtab.c, we need to inspect some # of the individual data fields. Just splitting on whitespace # won't work, because some quoted fields might contain internal # whitespace. We handle this by folding them all to a simple # "xxx". Fortunately, this script doesn't need to look at any # fields that might need quoting, so this simple hack is # sufficient. $row->{bki_values} =~ s/"[^"]*"/"xxx"/g; @{$row}{@attnames} = split /\s+/, $row->{bki_values}; We would need a bullet-proof, non-hack, preferably not too slow way to split DATA lines into fields properly. I'm one of the world's worst Perl programmers, but surely there's a way? 2. Alternatively, we could teach bootstrap mode to build a hashtable mapping proname to OID while it reads pg_proc data from postgres.bki, and then make regprocin's bootstrap path consult the hashtable instead of looking directly at the pg_proc file. That I'm quite sure is do-able, but it seems like it's leaving money on the table compared to doing the transformation earlier. Thoughts? Looked at this and option 1 seems simple enough if I am not missing something. I might hack something up later tonight. Either way I think this improvement can be done separately from the proposed replacement of the catalog header files. Trying to fix everything at once often leads to nothing being fixed at all. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 04/03/2017 07:57 AM, Michael Paquier wrote: On Fri, Mar 31, 2017 at 5:12 PM, Andreas Karlsson <andr...@proxel.se> wrote: On 03/31/2017 08:27 AM, Michael Paquier wrote: - Do a per-index rebuild and not a per-relation rebuild for concurrent indexing. Doing a per-relation reindex has the disadvantage that many objects need to be created at the same time, and in the case of REINDEX CONCURRENTLY time of the operation is not what matters, it is how intrusive the operation is. Relations with many indexes would also result in much object locks taken at each step. I am personally worried about the amount of time spent waiting for long running transactions if you reindex per index rather than per relation. Because when you wait on long running transactions for one index nothing prevents new long transactions from starting, which we will have to wait for while reindexing the next index. If your database has many long running transactions more time will be spent waiting than working. Yup, I am not saying that one approach or the other are bad, both are worth considering. That's a deal between waiting and manual potential cleanup in the event of a failure. Agreed, and which is worse probably depends heavily on your schema and workload. I am marking this patch as returned with feedback, this won't get in PG10. If I am freed from the SCRAM-related open items I'll try to give another shot at implementing this feature before the first CF of PG11. Thanks! I also think I will have time to work on this before the first CF. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Thanks for the feedback. I will look at it when I get the time. On 03/31/2017 08:27 AM, Michael Paquier wrote: - Do a per-index rebuild and not a per-relation rebuild for concurrent indexing. Doing a per-relation reindex has the disadvantage that many objects need to be created at the same time, and in the case of REINDEX CONCURRENTLY the time of the operation is not what matters, it is how intrusive the operation is. Relations with many indexes would also result in many object locks taken at each step. I am personally worried about the amount of time spent waiting for long running transactions if you reindex per index rather than per relation. Because while you wait on long running transactions for one index, nothing prevents new long transactions from starting, which we will have to wait for while reindexing the next index. If your database has many long running transactions, more time will be spent waiting than working. Is the number of locks really a big deal compared to the other costs involved here? REINDEX does a lot of expensive things like starting transactions, taking snapshots, scanning large tables, building a new index, etc. The trade-off I see is between temporary disk usage and time spent waiting for transactions, and doing the REINDEX per relation allows for flexibility since people can still explicitly reindex per index if they want to. Andreas
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On 03/29/2017 05:43 AM, Haribabu Kommi wrote: > Updated patch attached. I get a test failure in the pg_upgrade tests, but I do not have time right now to investigate. The failing test is "Restoring database schemas in the new cluster". I get the following in the log: command: "/home/andreas/dev/postgresql/src/bin/pg_upgrade/tmp_check/install//home/andreas/dev/postgresql-inst/bin/pg_dump" --host /home/andreas/dev/postgresql/src/bin/pg_upgrade --port 50848 --username andreas --schema-only --quote-all-identifiers --binary-upgrade --format=custom --file="pg_upgrade_dump_16385.custom" 'dbname='"'"'./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ'"'"'' >> "pg_upgrade_dump_16385.log" 2>&1 command: "/home/andreas/dev/postgresql/src/bin/pg_upgrade/tmp_check/install//home/andreas/dev/postgresql-inst/bin/pg_restore" --host /home/andreas/dev/postgresql/src/bin/pg_upgrade --port 50848 --username andreas --exit-on-error --verbose --dbname 'dbname='"'"'./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ'"'"'' "pg_upgrade_dump_16385.custom" >> "pg_upgrade_dump_16385.log" 2>&1 pg_restore: connecting to database for restore pg_restore: [archiver (db)] connection to database "./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ" failed: FATAL: database "./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ" does not exist Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rename pg_log directory?
On 03/27/2017 04:38 PM, Peter Eisentraut wrote: Committed. Thanks! While digging around a bit, I found in release-old.sgml that before PostgreSQL 7.2, pg_clog was called pg_log. Go figure. Yeah, I noticed that too when writing the patch. :) Andreas
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
Hi, Here is my review. I agree with the goal of the refactoring, as we want to make it easier to dump all the properties for the database object. But I think we need to solve the issues with the special casing of postgres and template1, which I would personally find very surprising behavior from pg_dump -C. On the other hand I think that we cannot get away from having pg_dumpall give them special treatment. The nitpicking section is for minor code style errors. = Functional review I have not done an in-depth functional review due to the discussion about how postgres and template1 should be handled. - The patch does not apply cleanly anymore - I do not like the change in behavior which causes "pg_dump -C postgres" to no longer include CREATE DATABASE. Special treatment of specific databases based on name makes sense in pg_dumpall, but not in pg_dump. - There are test failures in the pg_dump tests. It seems like some could be related to the fact that you do not include CREATE DATABASE postgres in the dumps, but I also get errors like 'ERROR: syntax error at or near "fault_tablespace"'. not ok 691 - createdb: dumps CREATE DATABASE postgres not ok 3003 - pg_dumpall_dbprivs: dumps CREATE DATABASE dump_test not ok 11 - restore full dump using environment variables for connection parameters not ok 12 - no dump errors not ok 13 - restore full dump with command-line options for connection parameters not ok 14 - no dump errors = Code review - As a response to "TBD -- is it necessary to get the default encoding": I think so, but either way changing this seems unrelated to this patch. - I know it is taken from the old pg_dumpall code, but the way the database owner is handled seems wrong. I think we should set it like the owner for other objects. And more importantly it should respect --no-owner. - The logic for switching database when setting the default tablespace is broken. You generate "\ connect" rather than "\connect". 
- I saw the comment "Note that we do not support initial privileges (pg_init_privs) on databases." and wondered: why not? I definitely think that we should support this. = Nitpicking - You should probably use SGML style over and for inline tags. - In "database-level properties such as Ownership, ACLs, [...]" I do not think that "Ownership" should be capitalized. - There are two extra spaces on the lines below, and a space is missing after the closing tag. ALTER ROLE IN DATABASE ... SET commands. with --create option to dump ALTER ROLE IN DATABASE ... SET - In the following comment ".." should be "...", since that is the correct way to write an ellipsis. * Frame the ALTER .. SET .. commands and fill it in buf. - Rename arrayitem to configitem in makeAlterConfigCommand(). - In makeAlterConfigCommand() you should do "*pos++ = '\0';" rather than "*pos = 0;" and then remove the later + 1, so the code matches the code in dumpFunc(). Either is correct, but it would be nice if both pieces of code looked more similar. - You removed an empty line in pg_backup_utils.h between global variables and function declarations which I think should be left there. It should be directly after g_verbose. - There is something wrong with the indentation of the query for collecting info about databases in dumpDatabase() for PG >= 9.6. - Missing space before "'' as rdatacl" in dumpDatabase(), and a missing space at the end of the string. - Double space in 'FROM pg_database "' in dumpDatabase(). Andreas
Re: [HACKERS] No more libedit?! - openssl plans to switch to APL2
On 08/01/2015 05:14 PM, Andres Freund wrote: According to https://www.openssl.org/blog/blog/2015/08/01/cla/ openssl is planning to relicense to the apache license 2.0. While APL2 is not compatible with GPL2 it *is* compatible with GPL3. Great! This means that the Debian packages will eventually be able to drop their LD_PRELOAD hack, which never worked perfectly due to compiling against libedit or libreadline headers resulting in different binaries. Andreas
Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall
On 03/21/2017 08:02 AM, Haribabu Kommi wrote: Solution -1) Just ignore dumping these CREATE DATABASE commands and provide the user information in the documentation to create "postgres" and "template1" database in the target in case if they don't exist. If this kind of cases are very rare. Solution-2) Add a new command line option/some other settings to indicate the pg_dump execution is from pg_dumpall and follow the current refactored behavior, otherwise follow the earlier pg_dump behavior in handling CREATE DATABASE commands for "postgres" and "template1" databases. I am leaning towards (2) since I feel having pg_dump act differently depending on the name of the database is a quite surprising behavior. It makes more sense to let a tool like pg_dumpall handle logic like that. 2. In dumpDatabases function before calling the runPgDump command, Before refactoring, it used to connect to the database and dump "SET default_transaction_read_only = off;" to prevent some accidental overwrite of the target. I fixed it in the attached patch by removing the connection and dumping the set command. Does it needs the similar approach of solution-2) in previous problem and handle dumping the "SET default_transaction_read_only = off;" whenever the CREATE DATABASE and \connect command is issued? Hm, that is a bit annoying. I do not think we want to change any behavior here, either of pg_dump or pg_dumpall, but I also do not like having to add two new flags to pg_dump (one for including the ALTER DATABASE commands but not CREATE DATABASE, and another flag for default_transaction_read_only) or a special flag similar to --binary-upgrade. None of these options seem optimal to me, and I do not have any strong preference other than that we should avoid breaking pg_dump or changing behavior not related to the database attributes. 
Andreas
Re: [HACKERS] ICU integration
I am fine with this version of the patch. The issues I have with it, which I mentioned earlier in this thread, seem to be issues with ICU rather than with this patch. For example there seems to be no way for ICU to validate the syntax of the BCP 47 locales (or ICU's old format). But I think we will just have to accept the weirdness of how ICU handles locales. I think this patch is ready to be committed. Found a typo in the documentation: "The inspect the currently available locales" should be "To inspect the currently available locales". Andreas
Re: [HACKERS] ICU integration
On 03/15/2017 05:33 PM, Peter Eisentraut wrote: Updated patch attached. Ok, I applied to patch again and ran the tests. I get a test failure on make check-world in the pg_dump tests but it can be fixed with the below. diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl index 3cac4a9ae0..7d9c90363b 100644 --- a/src/bin/pg_dump/t/002_pg_dump.pl +++ b/src/bin/pg_dump/t/002_pg_dump.pl @@ -2422,7 +2422,7 @@ qr/^\QINSERT INTO test_fifth_table (col1, col2, col3, col4, col5) VALUES (NULL, 'CREATE COLLATION test0 FROM "C";', regexp => qr/^ - \QCREATE COLLATION test0 (lc_collate = 'C', lc_ctype = 'C');\E/xm, + \QCREATE COLLATION test0 (provider = 'libc', locale = 'C');\E/xm, like => { binary_upgrade => 1, clean=> 1, - I do not like how it is not obvious which is the default version of every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not "sv_standard%icu" as one might expect. Is this inherent to ICU or something we can work around? We get these keywords from ucol_getKeywordValuesForLocale(), which says "Given a key and a locale, returns an array of string values in a preferred order that would make a difference." So all those are supposedly different from each other. I believe you are mistaken. The locale "sv" is just an alias for "sv-u-standard" as far as I can tell. See the definition of the Swedish locale (http://source.icu-project.org/repos/icu/trunk/icu4c/source/data/coll/sv.txt) and there are just three collations: reformed (default), standard, search (plus eot and emoji which are inherited). I am also quite annoyed at col-emoji and col-eor (European Ordering Rules). They are defined at the root and inherited by all languages, but no language modifies col-emoji for their needs which makes it a foot gun. See the Danish sorting example below where at least I expected the same order. For col-eor it makes a bit more sense since I would expect the locales en_GB-u-col-eot and sv_SE-u-col-eot to collate exactly the same. 
# SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE "da-x-icu"; x a k aa (3 rows) # SELECT * FROM (VALUES ('a'), ('k'), ('aa')) q (x) ORDER BY x COLLATE "da-u-co-emoji-x-icu"; x a aa k (3 rows) It seems ICU has made quite the mess here, and I am not sure to what degree we need to handle it to avoid our users getting confused. I need to think about some of it, and would love input from others. Maybe the right thing to do is to ignore the issue with col-emoji, but I would want to do something about the default collations. - ICU talks about a new syntax for locale extensions (old: de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page http://userguide.icu-project.org/collation/api. Is this something we need to care about? It looks like we currently use the old format, and while I personally prefer it I am not sure we should rely on an old syntax. Interesting. I hadn't seen this before, and the documentation is sparse. But it seems that this refers to BCP 47 language tags, which seem like a good idea. So what I have done is change all the predefined ICU collations to be named after the BCP 47 scheme, with a "private use" -x-icu suffix (instead of %icu). That preserves the original idea but uses a standard naming scheme. I'm not terribly worried that they are going to remove the "old" locale naming, but just to be forward-looking, I have changed it so that the collcollate entries are made using the "new" naming for ICU >= 54. Sounds good. - I get an error when creating a new locale. 
I had forgotten to specify the provider so the errors are correct (even though the value of the errno is weird). - For the collprovider is it really correct to say that 'd' is the default value as it does in catalogs.sgml? It doesn't say it's the default value, it says it uses the database default. This is all a bit confusing. We have a collation object named "default", which uses the locale set for the database. That's been that way for a while. Now when introducing the collation providers, that "default" collation gets its own collprovider category 'd'. That is not really used anywhere, but we have to put something there. Ah, I see now. Hm, that is a bit awkward but I cannot think of a cleaner alternative. - I do not know what the policy for formatting the documentation is, but some of the paragraphs are in need of re-wrapping. Hmm, I don't see anything terribly bad. Maybe it is just me.
Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)
On 03/19/2017 07:35 PM, Stephen Frost wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: Stephen Frost writes: (Or in other words, we've been getting along fine with these script names for circa twenty years, so what's the rush to change them RIGHT NOW?) To be clear, I'm not in any particular rush to change them 'RIGHT NOW'. I tend to agree with Magnus that we're doing a lot of other things in PG10 and that makes it a bit of a natural point, but I don't hold that position terribly strongly. On the other hand, I do not relish the idea of providing backwards-compatibility for every user-facing change we do for 5 years and that's where I feel this approach is encouraging us to go. I think that argument is only applicable where the changes are closely related, e.g. renaming pg_clog, pg_xlog and pg_log in the same release. I do not see any strong connection between createuser and pg_xlog. As for whether we should have backwards compatibility for the old names, I am leaning weakly towards providing it in the case of createuser. I can see end users being pissed off that the createuser command is suddenly gone without any warning when they upgrade. On the flip side I have no idea how much work it would be to maintain those legacy names. Andreas
Re: [HACKERS] Removing binaries (was: createlang/droplang deprecated)
On 03/18/2017 09:12 PM, Magnus Hagander wrote: createdb, dropdb - also not clear they're about postgres, more likely to be used by mistake but not that bad. That said, do they add any *value* beyond what you can do with psql -c "CREATE DATABASE"? I don't really see one, so I'd suggest dropping these too. The value they add is that they quote the database name and options correctly, which makes them easier to use safely and reliably in shell scripts. And unless I am missing something obvious I do not think there is any easy way for a beginner to do this with just psql. Andreas
Re: [HACKERS] \h tab-completion
On 03/17/2017 12:01 AM, Peter Eisentraut wrote: Committed with some tweaking. Thanks! Andreas
Re: [HACKERS] WIP: Faster Expression Processing v4
Hi, I got a test failure with this version of the patch in the postgres_fdw tests. It looks to me like it was caused by a typo in the source code which is fixed in the attached patch. After applying this patch check-world passes. Andreas diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c index 1763be5cf0..2ba5a2ea69 100644 --- a/src/backend/executor/nodeTidscan.c +++ b/src/backend/executor/nodeTidscan.c @@ -42,7 +42,6 @@ static void TidListCreate(TidScanState *tidstate); static int itemptr_comparator(const void *a, const void *b); static TupleTableSlot *TidNext(TidScanState *node); - /* * Compute the list of TIDs to be visited, by evaluating the expressions * for them. @@ -101,6 +100,7 @@ TidListCreate(TidScanState *tidstate) else if (IsCTIDVar(arg2)) exprstate = ExecInitExpr((Expr *) linitial(fex->args), &tidstate->ss.ps); + else elog(ERROR, "could not identify CTID variable"); itemptr = (ItemPointer)
Re: [HACKERS] \h tab-completion
On 03/01/2017 02:47 PM, Peter Eisentraut wrote: Instead of creating another copy of list_ALTER, let's use the words_after_create list and write a version of create_command_generator/drop_command_generator. Good idea. Here is a patch with that. Andreas commit 7d691929f5814da87bb8a532e7dcfa2bd1dc9f15 Author: Andreas Karlsson <andr...@proxel.se> Date: Fri Feb 3 13:05:48 2017 +0100 Add compleition for \help DROP|ALTER diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index e8458e939e..3df7636c5b 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -982,10 +982,11 @@ typedef struct #define THING_NO_CREATE (1 << 0) /* should not show up after CREATE */ #define THING_NO_DROP (1 << 1) /* should not show up after DROP */ -#define THING_NO_SHOW (THING_NO_CREATE | THING_NO_DROP) +#define THING_NO_ALTER (1 << 2) /* should not show up after ALTER */ +#define THING_NO_SHOW (THING_NO_CREATE | THING_NO_DROP | THING_NO_ALTER) static const pgsql_thing_t words_after_create[] = { - {"ACCESS METHOD", NULL, NULL}, + {"ACCESS METHOD", NULL, NULL, THING_NO_ALTER}, {"AGGREGATE", NULL, _for_list_of_aggregates}, {"CAST", NULL, NULL}, /* Casts have complex structures for names, so * skip it */ @@ -999,6 +1000,7 @@ static const pgsql_thing_t words_after_create[] = { {"CONVERSION", "SELECT pg_catalog.quote_ident(conname) FROM pg_catalog.pg_conversion WHERE substring(pg_catalog.quote_ident(conname),1,%d)='%s'"}, {"DATABASE", Query_for_list_of_databases}, {"DICTIONARY", Query_for_list_of_ts_dictionaries, NULL, THING_NO_SHOW}, + {"DEFAULT PRIVILEGES", NULL, NULL, THING_NO_CREATE | THING_NO_DROP}, {"DOMAIN", NULL, _for_list_of_domains}, {"EVENT TRIGGER", NULL, NULL}, {"EXTENSION", Query_for_list_of_extensions}, @@ -1006,12 +1008,13 @@ static const pgsql_thing_t words_after_create[] = { {"FOREIGN TABLE", NULL, NULL}, {"FUNCTION", NULL, _for_list_of_functions}, {"GROUP", Query_for_list_of_roles}, - {"LANGUAGE", Query_for_list_of_languages}, {"INDEX", 
NULL, _for_list_of_indexes}, + {"LANGUAGE", Query_for_list_of_languages, NULL, THING_NO_ALTER}, + {"LARGE OBJECT", NULL, NULL, THING_NO_CREATE | THING_NO_DROP}, {"MATERIALIZED VIEW", NULL, _for_list_of_matviews}, {"OPERATOR", NULL, NULL}, /* Querying for this is probably not such a * good idea. */ - {"OWNED", NULL, NULL, THING_NO_CREATE}, /* for DROP OWNED BY ... */ + {"OWNED", NULL, NULL, THING_NO_CREATE | THING_NO_ALTER}, /* for DROP OWNED BY ... */ {"PARSER", Query_for_list_of_ts_parsers, NULL, THING_NO_SHOW}, {"POLICY", NULL, NULL}, {"PUBLICATION", Query_for_list_of_publications}, @@ -1021,15 +1024,16 @@ static const pgsql_thing_t words_after_create[] = { {"SEQUENCE", NULL, _for_list_of_sequences}, {"SERVER", Query_for_list_of_servers}, {"SUBSCRIPTION", Query_for_list_of_subscriptions}, + {"SYSTEM", NULL, NULL, THING_NO_CREATE | THING_NO_DROP}, {"TABLE", NULL, _for_list_of_tables}, {"TABLESPACE", Query_for_list_of_tablespaces}, - {"TEMP", NULL, NULL, THING_NO_DROP}, /* for CREATE TEMP TABLE ... */ + {"TEMP", NULL, NULL, THING_NO_DROP | THING_NO_ALTER}, /* for CREATE TEMP TABLE ... */ {"TEMPLATE", Query_for_list_of_ts_templates, NULL, THING_NO_SHOW}, {"TEXT SEARCH", NULL, NULL}, {"TRIGGER", "SELECT pg_catalog.quote_ident(tgname) FROM pg_catalog.pg_trigger WHERE substring(pg_catalog.quote_ident(tgname),1,%d)='%s' AND NOT tgisinternal"}, {"TYPE", NULL, _for_list_of_datatypes}, - {"UNIQUE", NULL, NULL, THING_NO_DROP}, /* for CREATE UNIQUE INDEX ... */ - {"UNLOGGED", NULL, NULL, THING_NO_DROP}, /* for CREATE UNLOGGED TABLE + {"UNIQUE", NULL, NULL, THING_NO_DROP | THING_NO_ALTER}, /* for CREATE UNIQUE INDEX ... */ + {"UNLOGGED", NULL, NULL, THING_NO_DROP | THING_NO_ALTER}, /* for CREATE UNLOGGED TABLE * ... 
*/ {"USER", Query_for_list_of_roles}, {"USER MAPPING FOR", NULL, NULL}, @@ -1042,6 +1046,7 @@ static const pgsql_thing_t words_after_create[] = { static char **psql_completion(const char *text, int start, int end); static char *create_command_generator(const char *text, int state); static char *drop_command_generator(const char *text, int state); +static char *alter_command_generator(const char *text, int state); static char *complete_from_query(const char *text, int state); static char *complete_from_schema_query(const char *text, int state); static char *_complete_from_query(int is_schema_query, @@ -1316,6 +1321,17 @@ psql_completion(const char *text, i
Re: [HACKERS] ICU integration
On 03/09/2017 10:13 PM, Peter Eisentraut wrote: - Naming of collations: Are we happy with the "de%icu" naming? I might have come up with that while reviewing the IPv6 zone index patch. ;-) An alternative might be "de$icu" for more Oracle vibe and avoiding the need for double quotes in some cases. (But we have mixed-case names like "de_AT%icu", so further changes would be necessary to fully get rid of the need for quoting.) A more radical alternative would be to install ICU locales in a different schema and use the search_path, or even have a separate search path setting for collations only. Which leads to ... I do not like the schema-based solution since search_path already gives us enough headaches. As for the naming I am fine with the current scheme. It would be nice to have something we did not have to quote, but I do not like using dollar signs since they are already used for other things. - Selecting default collation provider: Maybe we want a setting, say in initdb, to determine which provider's collations get the "good" names? Maybe not necessary for this release, but something to think about. This does not seem necessary for the initial release. - Currently (in this patch), we check a collation's version when it is first used. But, say, after pg_upgrade, you might want to check all of them right away. What might be a good interface for that? (Possibly, we only have to check the ones actually in use, and we have dependency information for that.) How about adding a SQL-level function for checking this which can be called by pg_upgrade? = Review Here is an initial review. I will try to find some time to do more testing later this week. This is a really useful feature given the poor quality of collation support in libc. Just the fact that ICU versions its collations is huge, as is the larger range of available collations with high configurability. Additionally being able to use abbreviated keys again would be huge. 
ICU also supports writing your own collations and modifying existing ones, something which might be possible to expose in the future. In general ICU offers a lot for users who care about the details of text collation. == Functional - Generally things seem to work fine and as expected. - I get test failures in the default test suite due to not having the tr_TR locale installed. I would assume that this would be pretty common for hackers. *** *** 428,443 -- to_char SET lc_time TO 'tr_TR'; SELECT to_char(date '2010-04-01', 'DD TMMON '); to_char - ! 01 NIS 2010 (1 row) SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr%icu"); to_char - ! 01 NİS 2010 (1 row) -- backwards parsing --- 428,444 -- to_char SET lc_time TO 'tr_TR'; + ERROR: invalid value for parameter "lc_time": "tr_TR" SELECT to_char(date '2010-04-01', 'DD TMMON '); to_char - ! 01 APR 2010 (1 row) SELECT to_char(date '2010-04-01', 'DD TMMON ' COLLATE "tr%icu"); to_char - ! 01 APR 2010 (1 row) -- backwards parsing == - The code no longer compiles without HAVE_LOCALE_T. - I do not like how it is not obvious which is the default version of every locale. E.g. I believe that "sv%icu" is "sv_reformed%icu" and not "sv_standard%icu" as one might expect. Is this inherent to ICU or something we can work around? - ICU talks about a new syntax for locale extensions (old: de_DE@collation=phonebook, new: de_DE_u_co_phonebk) at this page http://userguide.icu-project.org/collation/api. Is this something we need to care about? It looks like we currently use the old format, and while I personally prefer it I am not sure we should rely on an old syntax. - I get an error when creating a new locale. 
#CREATE COLLATION sv2 (LOCALE = 'sv'); ERROR: could not create locale "sv": Success # CREATE COLLATION sv2 (LOCALE = 'sv'); ERROR: could not create locale "sv": Resource temporarily unavailable Time: 1.109 ms == Code review - For the collprovider is it really correct to say that 'd' is the default value as it does in catalogs.sgml? - I do not know what the policy for formatting the documentation is, but some of the paragraphs are in need of re-wrapping. - Add a hint to "ERROR: conflicting or redundant options". The error message is pretty unclear. - I am not a fan of this patch comparing the encoding with a -1 literal. How about adding -1 as a value to the enum? See the example below for a place where the patch compares with -1. ereport(NOTICE, (errcode(ERRCODE_DUPLICATE_OBJECT), -errmsg("collation \"%s\" for encoding \"%s\" already exists, skipping", - collname, pg_encoding_to_char(collencoding; +collencoding == -1 +? errmsg("collation \"%s\" already exists, skipping", +
Re: [HACKERS] \h tab-completion
On 03/13/2017 03:56 PM, David Steele wrote: Do you know when you will have a new patch available for review that incorporates Peter's request? I believe I will find the time to finish it some time in a couple of days. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/13/2017 03:11 AM, Andreas Karlsson wrote: I also fixed the code to properly support triggers. And by "support triggers" I actually meant fixing the support for moving the foreign keys to the new index. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/02/2017 03:10 AM, Michael Paquier wrote: On Wed, Mar 1, 2017 at 2:21 AM, Andreas Karlsson <andr...@proxel.se> wrote: +/* + * Copy constraint flags for old index. This is safe because the old index + * guaranteed uniqueness. + */ +newIndexForm->indisprimary = oldIndexForm->indisprimary; +oldIndexForm->indisprimary = false; +newIndexForm->indisexclusion = oldIndexForm->indisexclusion; +oldIndexForm->indisexclusion = false; [...] +deleteDependencyRecordsForClass(RelationRelationId, newIndexOid, +RelationRelationId, DEPENDENCY_AUTO); +deleteDependencyRecordsForClass(RelationRelationId, oldIndexOid, +ConstraintRelationId, DEPENDENCY_INTERNAL); + +// TODO: pg_depend for old index? Spotted one of my TODO comments there so I have attached a patch where I have cleaned up that function. I also fixed the code to properly support triggers. There is a lot of mumbo-jumbo in the patch to create the exact same index definition as the original one being reindexed, and that's a huge maintenance burden for the future. You can blame me for that in the current patch. I am wondering if it would not just be better to generate a CREATE INDEX query string and then use the SPI to create the index, and also do the following extensions at SQL level: - Add a sort of WITH NO DATA clause where the index is created, so the index is created empty, and is marked invalid and not ready. - Extend pg_get_indexdef_string() with an optional parameter to enforce the index name to something else, most likely it should be extended with the WITH NO DATA/INVALID clause, which should just be a storage parameter by the way. By doing something like that what the REINDEX CONCURRENTLY code path should just be careful about is that it chooses an index name that avoids any conflicts. Hm, I am not sure how much that would help since a lot of the mumbo-jumbo is by necessity in the step where we move the constraints over from the old index to the new. 
Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index 306def4a15..ca1aeca65f 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -923,7 +923,8 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), - ANALYZE, CREATE INDEX CONCURRENTLY, and + ANALYZE, CREATE INDEX CONCURRENTLY, + REINDEX CONCURRENTLY, ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ). diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 3908ade37b..3449c0af73 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + REINDEX will perform a concurrent build if + CONCURRENTLY is specified. To build the index without interfering + with production you should drop the index and reissue either the + CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + command. Indexes of toast relations can be rebuilt with REINDEX + CONCURRENTLY. @@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. 
+ There are several caveats to be aware of when using this option + see . + + + + + VERBOSE @@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + + + +Rebuilding an index can interfere with regular operation of a database. +Normally PostgreSQL locks the table whose index is rebuilt +against writes and performs the entire index build with a single scan of the +table. Other transactions can still read the table, but if they try to +insert, update, or delete rows in the table they will block unti
Re: [HACKERS] rename pg_log directory?
On 03/09/2017 11:25 PM, Bruce Momjian wrote: "data" and "base" were chosen because it is a "data-base", but with the pg_ prefixes it would be a pg_data_pg_base. ;-) Haha, I had not spotted that one despite always naming my data directory "data" while developing. Fun little tidbit there. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] adding an immutable variant of to_date
On 03/07/2017 09:56 PM, Sven R. Kunze wrote: On 07.03.2017 03:21, Andreas Karlsson wrote: 1) I do not think we currently allow setting the locale like this anywhere, so this will introduce a new concept to PostgreSQL. And you will probably need to add support for caching per locale. Good to know. Could you explain what you mean by "caching per locale"? The current code for to_char will, on the first call to to_char, build arrays with the localized names of the week days and the months. I suspect that you may need to build something similar, but with a set of arrays per locale. See the DCH_to_char function and its call to cache_locale_time. Andreas
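The "caching per locale" idea above can be sketched as a lazily built mapping from locale name to the arrays of localized names. This is an illustration only, not PostgreSQL's actual C code (the real implementation lives in cache_locale_time in formatting.c and caches a single locale); the locale data below is faked for the example:

```python
# Hypothetical sketch of per-locale caching of localized day/month names.
# The cache maps a locale name to its (day_names, month_names) arrays,
# built once on first use and reused afterwards.

_name_cache = {}  # locale name -> (day_names, month_names)

def localized_names(loc):
    """Return (day_names, month_names) for loc, building them at most once."""
    if loc not in _name_cache:
        # In real code these would be fetched from the C library for the
        # given locale; here two locales are hard-coded for illustration.
        fake_locale_data = {
            "en_US": (["Monday", "Tuesday"], ["January", "February"]),
            "de_DE": (["Montag", "Dienstag"], ["Januar", "Februar"]),
        }
        _name_cache[loc] = fake_locale_data[loc]
    return _name_cache[loc]
```

A second call with the same locale returns the cached arrays without rebuilding them, which is the property the thread is asking for.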
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/08/2017 03:48 AM, Robert Haas wrote: On Sun, Mar 5, 2017 at 7:13 PM, Andreas Karlsson <andr...@proxel.se> wrote: And I would argue that this feature is useful to quite a few people, based on my experience running a semi-large database. Index bloat happens and without REINDEX CONCURRENTLY it can be really annoying to solve, especially for primary keys. Certainly more people have problems with index bloat than the number of people who store index oids in their database. Yeah, but that's not the only wart, I think. The only two potential issues I see with the patch are: 1) That the index oid changes visibly to external users. 2) That the code for moving the dependencies will need to be updated when adding new things which refer to an index oid. Given how useful I find REINDEX CONCURRENTLY I think these warts are worth it given that the impact is quite limited. I am of course biased since if I did not believe this I would not pursue this solution in the first place. For example, I believe (haven't looked at this patch series in a while) that the patch takes a lock and later escalates the lock level. If so, that could lead to doing a lot of work to build the index and then getting killed by the deadlock detector. This version of the patch no longer does that. For my use case escalating the lock would make this patch much less interesting. The highest lock level taken is the same one as the initial one (SHARE UPDATE EXCLUSIVE). The current patch does on a high level (very simplified) this: 1. CREATE INDEX CONCURRENTLY ind_new; 2. Atomically move all dependencies from ind to ind_new, rename ind to ind_old, and rename ind_new to ind. 3. 
DROP INDEX CONCURRENTLY ind_old; The actual implementation is a bit more complicated in reality, but no part escalates the lock level over what would be required by the steps for creating and dropping indexes concurrently. Also, if by any chance you think (or use any software that thinks) that OIDs for system objects are a stable identifier, this will be the first case where that ceases to be true. If the system is shut down or crashes or the session is killed, you'll be left with stray objects with names that you've never typed into the system. I'm sure you're going to say "don't worry, none of that is any big deal" and maybe you're right. Hm, I cannot think of any real-life scenario where this will be an issue based on my personal experience with PostgreSQL, but if you can think of one please provide it. I will try to ponder some more on this myself. Andreas
Re: [HACKERS] adding an immutable variant of to_date
On 03/03/2017 10:41 PM, Sven R. Kunze wrote: What do you think? I have some thoughts: 1) I do not think we currently allow setting the locale like this anywhere, so this will introduce a new concept to PostgreSQL. And you will probably need to add support for caching per locale. 2) As far as I can tell from reading the code to_date currently ignores the M suffix which indicates that you want localized month/day names, so I guess that to_date is currently immutable. Maybe it is stable due to the idea that we may want to support the M suffix in the future. 3) I do not like the to_date function. It is much too forgiving with invalid input. For example 2017-02-30 becomes 2017-03-02. Also just ignoring the M suffix in the format string seems pretty bad. Personally I would rather see a new date parsing function which is easier to work with or somehow fix to_date without pissing too many users off, but I have no idea if this is a view shared by the rest of the community. Andreas
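The forgiving behaviour complained about above can be illustrated with two toy parsers (the function names and the rollover logic are illustrative, not PostgreSQL code): a strict parser rejects the impossible date 2017-02-30, while a "rollover" parser silently spills the excess days into the next month, which is how 2017-02-30 can end up as 2017-03-02:

```python
from datetime import date, datetime, timedelta

def strict_parse(s):
    """Strict parser: impossible dates raise ValueError."""
    return datetime.strptime(s, "%Y-%m-%d").date()

def rollover_parse(s):
    """Hypothetical forgiving parser: day values past the end of the
    month spill into the following month, mimicking the to_date
    behaviour described in the thread."""
    y, m, d = (int(part) for part in s.split("-"))
    return date(y, m, 1) + timedelta(days=d - 1)

# strict_parse("2017-02-30")   -> raises ValueError
# rollover_parse("2017-02-30") -> date(2017, 3, 2)
```

The difference is exactly the surprise factor: a user who mistypes a day gets an error from the strict parser but a silently shifted date from the forgiving one.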
Re: [HACKERS] rename pg_log directory?
On 02/27/2017 03:05 PM, Peter Eisentraut wrote: How about changing the default for log_directory from 'pg_log' to, say, 'log'? I have attached a patch which does this. I do not care much about which name is picked as long as we get rid of the "pg_" prefix. Btw, is there a reason for why global and base do not have the "pg_" prefix? Andreas commit 0b71fcdb328f05349775675e0491ba1b82127d4e Author: Andreas Karlsson <andr...@proxel.se> Date: Mon Mar 6 23:52:49 2017 +0100 Rename default log directory from pg_log to log diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index cd82c04b05..4ee1f605bc 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4324,8 +4324,8 @@ SELECT * FROM parent WHERE key = 2400; find the logs currently in use by the instance. Here is an example of this file's content: -stderr pg_log/postgresql.log -csvlog pg_log/postgresql.csv +stderr log/postgresql.log +csvlog log/postgresql.csv current_logfiles is recreated when a new log file @@ -4427,7 +4427,7 @@ local0.*/var/log/postgresql cluster data directory. This parameter can only be set in the postgresql.conf file or on the server command line. -The default is pg_log. +The default is log. 
diff --git a/doc/src/sgml/file-fdw.sgml b/doc/src/sgml/file-fdw.sgml index 309a303e03..359f222352 100644 --- a/doc/src/sgml/file-fdw.sgml +++ b/doc/src/sgml/file-fdw.sgml @@ -262,7 +262,7 @@ CREATE FOREIGN TABLE pglog ( location text, application_name text ) SERVER pglog -OPTIONS ( filename '/home/josh/9.1/data/pg_log/pglog.csv', format 'csv' ); +OPTIONS ( filename '/home/josh/9.1/data/log/pglog.csv', format 'csv' ); diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 0707f66631..69b9cdacff 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -3298,7 +3298,7 @@ static struct config_string ConfigureNamesString[] = GUC_SUPERUSER_ONLY }, _directory, - "pg_log", + "log", check_canonical_path, NULL, NULL }, { diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 157d775853..e6dbc31591 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -343,7 +343,7 @@ # (change requires restart) # These are only used if logging_collector is on: -#log_directory = 'pg_log' # directory where log files are written, +#log_directory = 'log' # directory where log files are written, # can be absolute or relative to PGDATA #log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern, # can include strftime() escapes diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm index e5cb348f4c..1c87e39e0d 100644 --- a/src/test/perl/PostgresNode.pm +++ b/src/test/perl/PostgresNode.pm @@ -560,7 +560,7 @@ sub _backup_fs $backup_path, filterfn => sub { my $src = shift; - return ($src ne 'pg_log' and $src ne 'postmaster.pid'); + return ($src ne 'log' and $src ne 'postmaster.pid'); }); if ($hot) diff --git a/src/test/perl/RecursiveCopy.pm b/src/test/perl/RecursiveCopy.pm index 3e98813286..457488ba5b 100644 --- a/src/test/perl/RecursiveCopy.pm +++ b/src/test/perl/RecursiveCopy.pm @@ -48,9 +48,9 @@ 
attempted. RecursiveCopy::copypath('/some/path', '/empty/dir', filterfn => sub { - # omit pg_log and contents + # omit log and contents my $src = shift; - return $src ne 'pg_log'; + return $src ne 'log'; } );
Re: [HACKERS] rename pg_log directory?
On 03/01/2017 05:49 AM, Peter Eisentraut wrote: On 2/27/17 09:51, Tom Lane wrote: No objection to the basic point, but "log" seems perhaps a little too generic to me. Would something like "server_log" be better? Well, "log" is pretty well established. There is /var/log, and if you unpack a, say, Kafka or Cassandra distribution, they also come with a log or logs directory. +1, though I am also fine with server_log. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/05/2017 07:56 PM, Robert Haas wrote: On Sat, Mar 4, 2017 at 12:34 PM, Andres Freund wrote: I agree that it'd be nicer not to have this, but not having the feature at all is a lot worse than this wart. I, again, give that a firm "maybe". If the warts end up annoying 1% of the users who try to use this feature, then you're right. If they end up making a substantial percentage of people who try to use this feature give up on it, then we've added a bunch of complexity and future code maintenance for little real gain. I'm not ruling out the possibility that you're 100% correct, but I'm not nearly as convinced of that as you are. I agree that these warts are annoying, but I also think the limitations can be explained pretty easily to users (e.g. by explaining in the manual how REINDEX CONCURRENTLY creates a new index to replace the old one), and I think that is the important question when deciding whether a useful feature with warts should be merged: does it make things substantially harder for the average user to grok? And I would argue that this feature is useful to quite a few people, based on my experience running a semi-large database. Index bloat happens and without REINDEX CONCURRENTLY it can be really annoying to solve, especially for primary keys. Certainly more people have problems with index bloat than the number of people who store index oids in their database. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 03/02/2017 02:25 AM, Jim Nasby wrote: On 2/28/17 11:21 AM, Andreas Karlsson wrote: The only downside I can see to this approach is that we will no longer be able to reindex catalog tables concurrently, but in return it should be easier to confirm that this approach can be made to work. Another downside is any stored regclass fields will become invalid. Admittedly that's a pretty unusual use case, but it'd be nice if there was at least a way to let users fix things during the rename phase (perhaps via an event trigger). Good point, but I agree with Andres here. Having REINDEX CONCURRENTLY issue event triggers seems strange to me. While it does create and drop indexes as part of its implementation, it is actually just an index maintenance job. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
Hi, Here is a third take on this feature, heavily based on Michael Paquier's 2.0 patch. This time the patch does not attempt to preserve the index oids, but instead creates new indexes and moves all dependencies from the old indexes to the new before dropping the old ones. The only downside I can see to this approach is that we will no longer be able to reindex catalog tables concurrently, but in return it should be easier to confirm that this approach can be made to work. This patch relies on the fact that we can change the indisvalid flag of indexes transactionally, and as far as I can tell this is the case now that we have MVCC for the catalog updates. The code does some extra intermediate commits when building the indexes to avoid long running transactions. How REINDEX CONCURRENTLY operates: For each table: 1. Create new indexes without populating them, and lock the tables and indexes for the session. 2. After waiting for all running transactions, populate each index in a separate transaction and set them to ready. 3. After waiting again for all running transactions, validate each index in a separate transaction (but not setting them to valid just yet). 4. Swap all dependencies over from each old index to the new index and rename the old and the new indexes (the old index from name to name_ccold, and the new index from name_new to name), and set the isprimary and isexclusion flags. Here we also mark the new indexes as valid and the old indexes as invalid. 5. After waiting for all running transactions we change each index from invalid to dead. 6. After waiting for all running transactions we drop each index. 7. Drop all session locks. 
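For comparison, the long-standing manual workaround that these steps automate is roughly the following (table, column, and index names are illustrative):

```sql
-- Manual equivalent of REINDEX CONCURRENTLY for a plain index.
CREATE INDEX CONCURRENTLY ind_new ON t (col);  -- build without blocking writes
DROP INDEX CONCURRENTLY ind;                   -- drop the bloated index
ALTER INDEX ind_new RENAME TO ind;             -- take over the old name
```

The manual recipe cannot be used for indexes backing constraints such as primary keys, which is part of why a built-in REINDEX CONCURRENTLY that also swaps the dependencies is valuable.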
Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index 306def4a15..ca1aeca65f 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -923,7 +923,8 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), - ANALYZE, CREATE INDEX CONCURRENTLY, and + ANALYZE, CREATE INDEX CONCURRENTLY, + REINDEX CONCURRENTLY, ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ). diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 3908ade37b..3449c0af73 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + REINDEX will perform a concurrent build if + CONCURRENTLY is specified. To build the index without interfering + with production you should drop the index and reissue either the + CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + command. Indexes of toast relations can be rebuilt with REINDEX + CONCURRENTLY. @@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. 
+ There are several caveats to be aware of when using this option + see . + + + + + VERBOSE @@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + + + +Rebuilding an index can interfere with regular operation of a database. +Normally PostgreSQL locks the table whose index is rebuilt +against writes and performs the entire index build with a single scan of the +table. Other transactions can still read the table, but if they try to +insert, update, or delete rows in the table they will block until the +index rebuild is finished. This could have a severe effect if the system is +a live production database. Very large tables can take many hours to be +indexed, and even for smaller tables, an index rebuild can lock out writers +for periods that are unacceptably long for a production system. + + + +PostgreSQL supports rebuilding indexes with minimum locking +of writes. This method is invoked by specifying the +CONCURRENTLY option of
Re: [HACKERS] Disallowing multiple queries per PQexec()
On 02/28/2017 03:13 PM, Bruce Momjian wrote: I might have added that one; the text is: Consider disallowing multiple queries in PQexec() as an additional barrier to SQL injection attacks and it is a "consider" item. Should it be moved to the Wire Protocol Changes / v4 Protocol section or removed? A new protocol version won't solve the breakage of the C API, so I am not sure we can ever drop this feature other than by adding a new function or something in the protocol to support this. Andreas
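The injection concern behind this TODO item can be shown with a toy model (plain Python, not libpq): an API that accepts several ';'-separated statements lets an injected statement ride along, while a single-statement API rejects the whole string:

```python
def naive_exec(sql):
    """Toy model of a multi-statement API like PQexec: happily splits
    and runs every ';'-separated statement in the string."""
    return [stmt.strip() for stmt in sql.split(";") if stmt.strip()]

def single_exec(sql):
    """Toy model of a single-statement API: rejects query strings that
    contain more than one statement, blocking piggy-backed injections."""
    stmts = naive_exec(sql)
    if len(stmts) != 1:
        raise ValueError("multiple statements not allowed")
    return stmts[0]

# An injected "; DROP TABLE users" rides along with naive_exec,
# but single_exec refuses the whole string.
```

This is only the behavioural idea; real libpq would additionally need to keep PQexec working for existing C callers, which is the compatibility problem raised above.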
[HACKERS] Question about memory contexts in ReindexMultipleTables()
Hi, When working on REINDEX CONCURRENTLY I noticed that the new memory context created in ReindexMultipleTables() seems pointless. The purpose claimed in the code for introducing the ReindexMultipleTables context is to make sure the list we build with relation IDs survives the commit, since a commit destroys the TopTransactionContext and ReindexMultipleTables() first runs one transaction to list which tables should be reindexed and then reindexes each index in a separate transaction. But in the first transaction, where the list is built, we actually never use TopTransactionContext; instead PortalHeapMemory is used, which is a memory context which does not go away until the REINDEX command has completed. So everything should work in the same way even if we just remove the ReindexMultipleTables memory context. Am I missing something? Should the ReindexMultipleTables memory context be removed, or should we switch to TopTransactionContext at the beginning of ReindexMultipleTables() so temporary resources used in the initial transaction can be freed? Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 02/17/2017 01:53 PM, Andreas Karlsson wrote: I am actually thinking about going the opposite direction (by reducing the number of times we call WaitForLockers), because it is not just about consuming transaction IDs, we also do not want to wait too many times for transactions to commit. I am leaning towards only calling WaitForLockersMultiple three times per table. 1. Between building and validating the new indexes. 2. Between setting the old indexes to invalid and setting them to dead. 3. Between setting the old indexes to dead and dropping them. Right now my patch loops over the indexes in step 2 and 3 and waits for lockers once per index. This seems rather wasteful. I have thought that the code might be cleaner if we just looped over all indexes (and as a bonus the VERBOSE output would be more obvious), but I do not think it is worth waiting for lockers all those extra times. Thinking about this makes me wonder why you decided to use a transaction per index in many of the steps rather than a transaction per step. Most steps should be quick. The only steps where I think it makes sense to have a transaction per table are: 1) When building indexes, to avoid long running transactions. 2) When validating the new indexes, also to avoid long running transactions. But when swapping the indexes or when dropping the old indexes I do not see any reason to not just use one transaction per step since we do not even have to wait for any locks (other than WaitForLockers, which we just want to call once anyway since all indexes relate to the same table). Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 02/14/2017 04:56 AM, Michael Paquier wrote: On Tue, Feb 14, 2017 at 11:32 AM, Andreas Karlsson <andr...@proxel.se> wrote: On 02/13/2017 06:31 AM, Michael Paquier wrote: Er, something like that as well, no? DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. REINDEX (VERBOSE) currently prints one such line per index, which does not really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes on a relation at the same time. It is not immediately obvious how this should work. Maybe one such detail line per table? Hard to recall this thing in detail after all this time, and the fact that a relation is reindexed by processing all the indexes once at each step. Hm... What if ReindexRelationConcurrently() actually were refactored in such a way that it processes all the steps for each index individually? This way you can monitor the time it takes to build each index completely, including its validation. This operation would consume more transactions, but in the event of a failure the amount of things to clean up is really reduced, particularly for relations with many indexes. This would as well reduce VERBOSE to print one line per index rebuilt. I am actually thinking about going the opposite direction (by reducing the number of times we call WaitForLockers), because it is not just about consuming transaction IDs, we also do not want to wait too many times for transactions to commit. I am leaning towards only calling WaitForLockersMultiple three times per table. 1. Between building and validating the new indexes. 2. Between setting the old indexes to invalid and setting them to dead. 3. Between setting the old indexes to dead and dropping them. Right now my patch loops over the indexes in step 2 and 3 and waits for lockers once per index. This seems rather wasteful. 
I have thought that the code might be cleaner if we just looped over all indexes (and as a bonus the VERBOSE output would be more obvious), but I do not think it is worth waiting for lockers all those extra times. Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 02/13/2017 06:31 AM, Michael Paquier wrote: - What should we do with REINDEX DATABASE CONCURRENTLY and the system catalog? I do not think we can reindex the system catalog concurrently safely, so what should REINDEX DATABASE do with the catalog indexes? Skip them, reindex them while taking locks, or just error out? System indexes cannot have their OIDs changed as they are used in syscache lookups. So just logging a warning looks fine to me, and it is the price to pay to avoid taking an exclusive lock even for a short amount of time. Good idea, I think I will add one line of warning if it finds any system index in the schema. - What level of information should be output in VERBOSE mode? Er, something like that as well, no? DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. REINDEX (VERBOSE) currently prints one such line per index, which does not really work for REINDEX (VERBOSE) CONCURRENTLY since it handles all indexes on a relation at the same time. It is not immediately obvious how this should work. Maybe one such detail line per table? This is a crasher: create table aa (a int primary key); reindex (verbose) schema concurrently public ; For invalid indexes sometimes snapshots are still active (after issuing the previous crash for example): =# reindex (verbose) table concurrently aa; WARNING: XX002: cannot reindex concurrently invalid index "public.aa_pkey_cct", skipping LOCATION: ReindexRelationConcurrently, indexcmds.c:2119 WARNING: 01000: snapshot 0x7fde12003038 still active Thanks for testing the patch! The crash was caused by things being allocated in the wrong memory context when reindexing multiple tables and therefore freed on the first intermediate commit. I have created a new memory context to handle this, in which I only allocate the lists which need to survive between transactions. Hm, when writing the above I just realized why ReindexTable/ReindexIndex did not suffer from the same bug. 
It is because the first transaction there allocates in the PortalHeapMemory context, which survives commit. I really need to look at whether there is a clean way to handle memory contexts in my patch. I also found the snapshot still active bug; it seems to have been caused by REINDEX TABLE CONCURRENTLY leaving an open snapshot which cannot be popped by PortalRunUtility(). Thanks again! Andreas
Re: [HACKERS] REINDEX CONCURRENTLY 2.0
On 02/02/2015 03:10 PM, Andres Freund wrote: I think we should instead just use the new index, repoint the dependencies onto the new oid, and then afterwards, when dropping, rename the new index onto the old one. That means the oid of the index will change and some less than pretty grovelling around dependencies, but it still seems preferable to what we're discussing here otherwise. I think that sounds like a good plan. The oid change does not seem like too big a deal to me, especially since that is what users will get now too. Do you still think this is the right way to solve this? I have attached my work in progress patch which implements this and is very heavily based on Michael's previous work. There are some things left to do but I think I should have a patch ready for the next commitfest if people still like this type of solution. I also changed index_set_state_flags() to be transactional since I wanted the old index to become invalid at exactly the same time as the new becomes valid. From reviewing the code that seems like a safe change. A couple of bike shedding questions: - Is the syntax "REINDEX CONCURRENTLY " ok? - What should we do with REINDEX DATABASE CONCURRENTLY and the system catalog? I do not think we can reindex the system catalog concurrently safely, so what should REINDEX DATABASE do with the catalog indexes? Skip them, reindex them while taking locks, or just error out? - What level of information should be output in VERBOSE mode? 
What remains to be implemented: - Support for exclusion constraints - Look more into how I handle constraints (currently the temporary index too seems to have the PRIMARY KEY flag) - Support for the VERBOSE flag - More testing to catch bugs Andreas diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index 306def4a15..ca1aeca65f 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -923,7 +923,8 @@ ERROR: could not serialize access due to read/write dependencies among transact Acquired by VACUUM (without FULL), - ANALYZE, CREATE INDEX CONCURRENTLY, and + ANALYZE, CREATE INDEX CONCURRENTLY, + REINDEX CONCURRENTLY, ALTER TABLE VALIDATE and other ALTER TABLE variants (for full details see ). diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml index 3908ade37b..24464020cd 100644 --- a/doc/src/sgml/ref/reindex.sgml +++ b/doc/src/sgml/ref/reindex.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation -REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } name +REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name @@ -68,9 +68,12 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } CONCURRENTLY option failed, leaving an invalid index. Such indexes are useless but it can be convenient to use REINDEX to rebuild them. Note that - REINDEX will not perform a concurrent build. To build the - index without interfering with production you should drop the index and - reissue the CREATE INDEX CONCURRENTLY command. + REINDEX will perform a concurrent build if + CONCURRENTLY is specified. To build the index without interfering + with production you should drop the index and reissue either the + CREATE INDEX CONCURRENTLY or REINDEX CONCURRENTLY + command. Indexes of toast relations can be rebuilt with REINDEX + CONCURRENTLY. 
@@ -152,6 +155,21 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } +CONCURRENTLY + + + When this option is used, PostgreSQL will rebuild the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard reindex build + locks out writes (but not reads) on the table until it's done. + There are several caveats to be aware of when using this option + see . + + + + + VERBOSE @@ -231,6 +249,172 @@ REINDEX [ ( VERBOSE ) ] { INDEX | TABLE | SCHEMA | DATABASE | SYSTEM } + + Rebuilding Indexes Concurrently + + + index + rebuilding concurrently + + + +Rebuilding an index can interfere with regular operation of a database. +Normally PostgreSQL locks the table whose index is rebuilt +against writes and performs the entire index build with a single scan of the +table. Other transactions can still read the table, but if they try to +insert, update, or delete rows in the table they will block until the +index rebuild is finished. This could have a severe effect if the system is +a live production database. Very large tables can take many hours to be +indexed, and even for smaller tables, an index rebuild can lock out writers +for periods that are unacceptably long for a production system. + + + +PostgreSQL supports rebuilding
Re: [HACKERS] 'text' instead of 'unknown' in Postgres 10
On 02/07/2017 03:14 PM, Daniele Varrazzo wrote: In psycopg '{}'::unknown is treated specially as an empty array and converted into an empty list, which allows empty lists to be passed to the server as arrays and returned back to python. Without the special case, empty lists behave differently from non-empty ones. It seems this behaviour cannot be maintained on PG 10 and instead users need to specify some form of cast for their placeholder. Previously this would have worked "as expected" and the 4th argument would have been an empty list: cur.execute("SELECT %s, %s, %s, %s", (['x'], [42], [date(2017,1,1)], [])); cur.fetchone() (['x'], [42], [datetime.date(2017, 1, 1)], '{}') As Tom wrote this is the result of an intentional change, but no matter whether that change is a good thing or not, the above behavior sounds rather fragile. To me it does not seem safe to by default just assume that '{}' means the empty array; it might also have been intended to be the Python string "{}", the empty JSON object, or something entirely different. Andreas
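The ambiguity is easy to demonstrate: the same two-character literal '{}' is a valid empty JSON object, a valid empty PostgreSQL array literal, and a plain string. The toy parser below handles only the empty-array case and is purely illustrative (it is not psycopg code):

```python
import json

def parse_pg_array_empty(text):
    """Toy check treating the literal as an empty PostgreSQL array.
    Real array literals (e.g. '{1,2}') need a full parser; this only
    demonstrates that the literal '{}' also reads as an empty list."""
    if text == "{}":
        return []
    raise ValueError("not an empty array literal")

literal = "{}"
as_json = json.loads(literal)             # an empty JSON object ({})
as_array = parse_pg_array_empty(literal)  # an empty array ([])
as_text = literal                         # or simply the string "{}"
```

All three interpretations are legitimate, which is why guessing "empty array" by default is fragile once the server no longer resolves unknown to text.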
Re: [HACKERS] \h tab-completion
On 01/25/2017 07:13 AM, Michael Paquier wrote: What I think you should do is making the code path of \\h smarter with some exceptions by using TailMatchesCS2() for ALTER. There is as well the case of DROP commands that should be treated by the way. Yes, I think that is the correct approach. I have attached a patch where I add completion for \h ALTER and \h DROP. Andreas commit 045c92b10eb8777d29fc920c55561d645c0b8f30 Author: Andreas Karlsson <andr...@proxel.se> Date: Fri Feb 3 13:05:48 2017 +0100 Add completion for \help DROP|ALTER diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index d6fffcf42f..653630dac2 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -1262,6 +1262,17 @@ psql_completion(const char *text, int start, int end) (previous_words_count >= 2 && \ word_matches_cs(p1, prev_wd) && \ word_matches_cs(p2, prev2_wd)) +#define TailMatchesCS3(p3, p2, p1) \ + (previous_words_count >= 3 && \ + word_matches_cs(p1, prev_wd) && \ + word_matches_cs(p2, prev2_wd) && \ + word_matches_cs(p3, prev3_wd)) +#define TailMatchesCS4(p4, p3, p2, p1) \ + (previous_words_count >= 4 && \ + word_matches_cs(p1, prev_wd) && \ + word_matches_cs(p2, prev2_wd) && \ + word_matches_cs(p3, prev3_wd) && \ + word_matches_cs(p4, prev4_wd)) /* * Macros for matching N words beginning at the start of the line, @@ -3256,8 +3267,51 @@ psql_completion(const char *text, int start, int end) else if (TailMatchesCS1("\\encoding")) COMPLETE_WITH_QUERY(Query_for_list_of_encodings); - else if (TailMatchesCS1("\\h") || TailMatchesCS1("\\help")) + else if (TailMatchesCS1("\\h|\\help")) COMPLETE_WITH_LIST(sql_commands); + else if (TailMatchesCS2("\\h|\\help", MatchAny)) + { + if (TailMatches1("DROP")) + matches = completion_matches(text, drop_command_generator); + else if (TailMatches1("ALTER")) + { + static const char *const list_ALTER[] = + {"AGGREGATE", "COLLATION", "CONVERSION", "DATABASE", "DEFAULT PRIVILEGES", "DOMAIN", +"EVENT TRIGGER", "EXTENSION", 
"FOREIGN DATA WRAPPER", "FOREIGN TABLE", "FUNCTION", +"GROUP", "INDEX", "LANGUAGE", "LARGE OBJECT", "MATERIALIZED VIEW", "OPERATOR", +"POLICY", "PUBLICATION", "ROLE", "RULE", "SCHEMA", "SERVER", "SEQUENCE", +"SUBSCRIPTION", "SYSTEM", "TABLE", "TABLESPACE", "TEXT SEARCH", "TRIGGER", "TYPE", + "USER", "USER MAPPING FOR", "VIEW", NULL}; + + COMPLETE_WITH_LIST(list_ALTER); + } + } + else if (TailMatchesCS3("\\h|\\help", MatchAny, MatchAny)) + { + if (TailMatches2("DROP", "ACCESS")) + COMPLETE_WITH_CONST("METHOD"); + else if (TailMatches2("ALTER", "DEFAULT")) + COMPLETE_WITH_CONST("PRIVILEGES"); + else if (TailMatches2("ALTER|DROP", "EVENT")) + COMPLETE_WITH_CONST("TRIGGER"); + else if (TailMatches2("ALTER|DROP", "FOREIGN")) + COMPLETE_WITH_LIST2("DATA WRAPPER", "TABLE"); + else if (TailMatches2("ALTER", "LARGE")) + COMPLETE_WITH_CONST("OBJECT"); + else if (TailMatches2("ALTER|DROP", "MATERIALIZED")) + COMPLETE_WITH_CONST("VIEW"); + else if (TailMatches2("ALTER|DROP", "TEXT")) + COMPLETE_WITH_CONST("SEARCH"); + else if (TailMatches2("ALTER|DROP", "USER")) + COMPLETE_WITH_CONST("MAPPING FOR"); + } + else if (TailMatchesCS4("\\h|\\help", MatchAny, MatchAny, MatchAny)) + { + if (TailMatches3("ALTER|DROP", "FOREIGN", "DATA")) + COMPLETE_WITH_CONST("WRAPPER"); + else if (TailMatches3("ALTER|DROP", "USER", "MAPPING")) + COMPLETE_WITH_CONST("FOR"); + } else if (TailMatchesCS1("\\l*") && !TailMatchesCS1("\\lo*")) COMPLETE_WITH_QUERY(Query_for_list_of_databases); else if (TailMatchesCS1("\\password")) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Checksums by default?
On 01/21/2017 04:48 PM, Stephen Frost wrote: * Fujii Masao (masao.fu...@gmail.com) wrote: If the performance overhead by the checksums is really negligible, we may be able to get rid of wal_log_hints parameter, as well. Prior benchmarks showed it to be on the order of a few percent, as I recall, so I'm not sure that we can say it's negligible (and that's not why Magnus was proposing changing the default). It might be worth looking into using the CRC CPU instruction to reduce this overhead, like we do for the WAL checksums. Since that is a different algorithm, it would be a compatibility break and we would need to support the old algorithm for upgraded clusters. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
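For context, the CRC mentioned here is CRC-32C (Castagnoli), which is what the dedicated CPU instruction (SSE 4.2 on x86) computes for the WAL checksums; the data-page checksum uses a different, FNV-based algorithm, hence the compatibility concern. A minimal bit-by-bit reference implementation of CRC-32C, i.e. the computation the hardware instruction accelerates:

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-by-bit CRC-32C (Castagnoli), the algorithm that the SSE 4.2
 * "crc32" instruction implements in hardware.  0x82F63B78 is the
 * reflected form of the Castagnoli polynomial. */
uint32_t crc32c(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--)
    {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78 & (0U - (crc & 1)));
    }
    return ~crc;
}
```

Production code would use a slice-by-8 table or the hardware instruction; this reference form is only meant to pin down which polynomial and bit order are involved. The standard CRC-32C check value for the input "123456789" is 0xE3069283.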
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 01/04/2017 04:14 PM, Stephen Frost wrote: * Andreas Karlsson (andr...@proxel.se) wrote: A possible solution might be to only add the error throwing hook when loading certificates during SIGHUP (and on Windows) and to work as before on startup. Would that be an acceptable solution? I could write a patch for this if people are interested. I'm not sure I see how that's a solution..? Wouldn't that mean that a SIGHUP with an encrypted key would result in a failure? The solution, at least in my view, seems to be to say "sorry, we can't reload the SSL stuff if you used a passphrase to unlock the key on startup, you will have to perform a restart if you want the SSL bits to be changed." Sorry, I was very unclear. I meant refusing to reload the SSL context if there is a pass phrase, but that the rest of the config will be reloaded just fine. This will lead to some log spam on every SIGHUP for people with a pass phrase but should otherwise work as before. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 01/04/2017 03:48 PM, Magnus Hagander wrote: On Wed, Jan 4, 2017 at 3:47 PM, Tom Lane
Re: [HACKERS] pg_sequence catalog
On 01/03/2017 03:30 PM, Peter Eisentraut wrote: On 1/3/17 7:23 AM, Kuntal Ghosh wrote: The regression tests for hot standby check fails since it uses the following statement: -select min_value as sequence_min_value from hsseq; which is no longer supported I guess. It should be modified as following: select min_value as sequence_min_value from pg_sequences where sequencename = 'hsseq'; Attached is a patch which reflects the above changes. Fixed, thanks. I made a note to self to port this test to the TAP framework. Hm, doesn't this change the intent of the test case? As I read the test, it seems to make sure that we are allowed to read from a sequence relation on the slave. If so, I think it should be changed to something like the below. select is_called from hsseq; Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why does plpython delay composite type resolution?
On 12/21/2016 04:14 AM, Jim Nasby wrote: Why do functions that accept composite types delay type resolution until execution? I have a naive patch that speeds up plpy.execute() by 8% by caching interned Python strings for the dictionary key names (which are repeated over and over). The next step is to just pre-allocate those strings as appropriate for the calling context, but it's not clear how to handle that for input arguments. Does your patch handle "ALTER TYPE name ADD ATTRIBUTE ..."? My immediate guess is that this is a cache invalidation issue. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 12/04/2016 03:20 PM, Michael Paquier wrote: On Sun, Dec 4, 2016 at 11:11 PM, Andreas Karlsson <andr...@proxel.se> wrote: On 12/04/2016 02:12 PM, Michael Paquier wrote: One last thing that I think is missing in this patch is the possibility for users to check via SQL whether the SSL context is actually loaded or not. As the context is reloaded after all the new values are available, with the current patch users may see that ssl is set to on but no context is loaded. So why not add, for example, a read-only parameter that maps to SSLLoaded? The other three issues are easy to fix, but this one is a bit trickier. Do you mean that we should add another GUC here which is read-only? Yes, that's what I meant. It is hard to track whether the reloading has been effective or not. Does this have a precedent in the code? data_checksums in guc.c is an example: it is marked with GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE and its value is updated with SetConfigOption() when the control file is read. The idea would be to do something like that with LoadedSSL. Thanks. I will be quite busy the upcoming couple of weeks, so it will be a while before I can sit down with this. Feel free to contribute to the patch. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 12/04/2016 02:12 PM, Michael Paquier wrote: One last thing that I think is missing in this patch is the possibility for users to check via SQL whether the SSL context is actually loaded or not. As the context is reloaded after all the new values are available, with the current patch users may see that ssl is set to on but no context is loaded. So why not add, for example, a read-only parameter that maps to SSLLoaded? The other three issues are easy to fix, but this one is a bit trickier. Do you mean that we should add another GUC here which is read-only? Does this have a precedent in the code? Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_sequence catalog
I think this patch looks good now, so I am setting it to ready for committer. I like the idea of the patch, and while this change will break some tools which look at the sequence relations, I think the advantages are worth it (for example, making more sequence DDL respect MVCC). Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/30/2016 06:52 AM, Michael Paquier wrote: On Mon, Nov 28, 2016 at 2:01 PM, Michael Paquier Looking at the latest patch at code level, there is some refactoring to introduce initialize_context(). But it is actually not necessary (perhaps this is the remnant of a past version?) as be_tls_init() is its only caller. This patch makes it hard to look at the diffs, and it makes future back-patching more complicated, so I would suggest simplifying the patch as much as possible. You could use for example a goto block for the error code path to free the context with SSL_CTX_free(), and set up ssl_loaded_verify_locations once the CA is loaded. The same applies to initialize_ecdh(). + if (secure_initialize() != 0) + ereport(FATAL, + (errmsg("could not load ssl context"))); + LoadedSSL = true; In case of a failure, a LOG message would already have been generated, so this duplicates the information. Worse, if log_min_messages is set to a level higher than LOG, users *lose* information on what has happened. I think that secure_initialize() should have an argument to define elevel and that this routine should be in charge of generating an error message. Having it return a status code is necessary, but you could cast secure_initialize() with (void) to show that we don't care about the status code in this case. There is no need to care about freeing the new context when the FATAL code path is used as the process would just shut down. Thanks, this is a really good suggestion which made the diff much cleaner. I removed my new macro too, since I felt it mostly made the code more cryptic just to save a few lines of code. config.sgml needs to be updated to reflect that the SSL parameters are updated at server reload (mentioned already upthread, just re-mentioning it to group all the comments in one place). Thanks, fixed this. As this patch could be really simplified this way, I am marking it as returned with feedback. I have attached a new version. 
The commitfest should technically have been closed by now, so do what you like with the status. I can always submit the patch to the next commitfest. Andreas diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index b1c5289..5f80930 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -959,9 +959,8 @@ include_dir 'conf.d' Enables SSL connections. Please read before using this. The default -is off. This parameter can only be set at server -start. SSL communication is only possible with -TCP/IP connections. +is off. SSL communication is only +possible with TCP/IP connections. @@ -979,7 +978,7 @@ include_dir 'conf.d' and client certificate verification is not performed. (In previous releases of PostgreSQL, the name of this file was hard-coded as root.crt.) Relative paths are relative to the -data directory. This parameter can only be set at server start. +data directory. @@ -994,8 +993,7 @@ include_dir 'conf.d' Specifies the name of the file containing the SSL server certificate. The default is server.crt. Relative paths are -relative to the data directory. This parameter can only be set at -server start. +relative to the data directory. @@ -1012,8 +1010,7 @@ include_dir 'conf.d' revocation list (CRL). The default is empty, meaning no CRL file is loaded. (In previous releases of PostgreSQL, the name of this file was hard-coded as root.crl.) Relative paths are -relative to the data directory. This parameter can only be set at -server start. +relative to the data directory. @@ -1028,8 +1025,7 @@ include_dir 'conf.d' Specifies the name of the file containing the SSL server private key. The default is server.key. Relative paths are -relative to the data directory. This parameter can only be set at -server start. +relative to the data directory. 
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 787cfce..5e78d81 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -2288,8 +2288,8 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433 The files server.key, server.crt, root.crt, and root.crl (or their configured alternative names) -are only examined during server start; so you must restart -the server for changes in them to take effect. +are examined when reloading the configuration, or when spawning the backend +process on Windows systems. diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 668f217..ebd00de 100644 --- a/src/backend/libpq/be-secure-openssl.c +++
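The shape of the reload logic discussed in this thread is a generic build-then-swap pattern: construct a complete new context first, and only replace the active one once every load step has succeeded, so a failed reload leaves the old, working context untouched. A schematic sketch with the OpenSSL specifics stubbed out (the Context type and function names here are illustrative, not the patch's actual API):

```c
#include <stdlib.h>

/* Stand-in for SSL_CTX; in the real patch this is OpenSSL's context. */
typedef struct Context { int generation; } Context;

static Context *active_context = NULL;

/* Stand-in for initialize_context(): returns NULL when any load step
 * (certificate, private key, CA, CRL, ...) fails. */
static Context *initialize_context(int fail)
{
    Context *ctx;

    if (fail)
        return NULL;
    ctx = malloc(sizeof(Context));
    if (ctx == NULL)
        return NULL;
    ctx->generation = active_context ? active_context->generation + 1 : 1;
    return ctx;
}

/* Reload: only replace the active context if the new one is complete. */
int reload_context(int fail)
{
    Context *newctx = initialize_context(fail);

    if (newctx == NULL)
        return -1;              /* keep serving with the old context */
    free(active_context);
    active_context = newctx;
    return 0;
}
```

This is also why Michael's suggestion of a single error path (a goto block freeing the half-built context) fits: all the cleanup for a failed build belongs to the new context, never to the one still in service.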
Re: [HACKERS] pgcrypto compilation error due to stack-allocated EVP_CIPHER_CTX
On 12/01/2016 02:48 AM, Andres Freund wrote: It appears openssl has removed the public definition of EVP_CIPHER_CTX leading to pgcrypto failing with: Yes, I believe this is one of the changes in OpenSSL 1.1. I guess you might be the first one to try to compile with 1.1 since 5ff4a67f63fd6d3eb01ff9707d4674ed54a89f3b was pushed. If we do not already have one, I think we should get a build farm animal with OpenSSL 1.1. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
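The compile error comes from OpenSSL 1.1 making EVP_CIPHER_CTX an opaque type: callers can no longer declare one on the stack and must go through EVP_CIPHER_CTX_new()/EVP_CIPHER_CTX_free() instead. A minimal self-contained illustration of the same opaque-handle pattern (the cipher_ctx names are invented for this sketch; in a real library the struct definition would live in a private translation unit, not next to the declaration):

```c
#include <stdlib.h>

/* The public header exposes only this forward declaration, so
 * "struct cipher_ctx ctx;" on the stack fails to compile for callers;
 * they must use the allocator/destructor pair instead. */
struct cipher_ctx;

struct cipher_ctx *cipher_ctx_new(void);
void cipher_ctx_free(struct cipher_ctx *ctx);

/* --- private side (normally a separate .c file) --- */
struct cipher_ctx
{
    int initialized;
};

struct cipher_ctx *cipher_ctx_new(void)
{
    struct cipher_ctx *ctx = calloc(1, sizeof(struct cipher_ctx));

    if (ctx != NULL)
        ctx->initialized = 1;
    return ctx;
}

void cipher_ctx_free(struct cipher_ctx *ctx)
{
    free(ctx);
}
```

Hiding the definition lets the library change the struct layout between releases without breaking ABI, which is exactly the trade pgcrypto ran into.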
Re: [HACKERS] Broken SSL tests in master
On 11/25/2016 07:19 AM, Tsunakawa, Takayuki wrote: Specifying multiple hosts is a new feature to be introduced in v10, so that's here: https://www.postgresql.org/docs/devel/static/libpq-connect.html Thanks, I had missed that patch. If we add support for multiple hosts, I think we should also allow specifying the corresponding multiple hostaddrs. Another thought about this code: should we not check whether it is a unix socket before splitting the host? While I doubt it is common to have a unix socket in a directory with a comma in the path, it is a bit annoying that we no longer support this. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
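The unix-socket concern above can be sketched as follows: before splitting a libpq host value on commas, check whether it names a socket directory (an absolute path on Unix) and, if so, treat the whole string as a single host. A toy version under that assumption (the real libpq check would involve is_absolute_path() and HAVE_UNIX_SOCKETS; this is a simplification):

```c
#include <stdbool.h>

/* Simplified: on Unix, a host value starting with '/' is a socket
 * directory path, which may legally contain commas. */
static bool looks_like_unix_socket(const char *host)
{
    return host[0] == '/';
}

/* Count how many hosts a libpq-style host string specifies, splitting
 * on ',' only when the value is not a socket directory path. */
int count_hosts(const char *host)
{
    int n = 1;

    if (looks_like_unix_socket(host))
        return 1;
    for (const char *p = host; *p != '\0'; p++)
        if (*p == ',')
            n++;
    return n;
}
```

Without the guard, a value like "/var/run/postgresql,old" would silently be split into two bogus hosts, which is the regression being pointed out.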
Re: [HACKERS] Broken SSL tests in master
On 11/25/2016 06:11 AM, Tsunakawa, Takayuki wrote: However, I wonder whether the hostaddr parameter should also accept multiple IP addresses. Yeah, I too thought about whether we should fix that. I feel like it would make sense to add support for multiple hostaddrs, for consistency's sake if nothing else. By the way, are comma-separated hosts documented somewhere? They are not included in https://www.postgresql.org/docs/9.6/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Broken SSL tests in master
On 11/24/2016 10:38 PM, Andreas Karlsson wrote: To me it feels like the proper fix would be to make PQHost() return the value of the host parameter rather than the hostaddr (maybe add a new field in the pg_conn_host struct). But would be a behaviour change which might break someones application. Thoughts? I have attached a proof of concept patch for this. Remaining work is investigating all the callers of PQhost() and see if any of them are negatively affected by this patch and cleaning up the code some. Andreas diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 3e9c45b..39c11eb 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -798,8 +798,34 @@ connectOptions2(PGconn *conn) */ if (conn->pghostaddr != NULL && conn->pghostaddr[0] != '\0') { - conn->connhost[0].host = strdup(conn->pghostaddr); - if (conn->connhost[0].host == NULL) + if (conn->pghost != NULL && conn->pghost[0] != '\0') + { + char *e = conn->pghost; + + /* + * Search for the end of the first hostname; a comma or + * end-of-string acts as a terminator. + */ + while (*e != '\0' && *e != ',') +++e; + + /* Copy the hostname whose bounds we just identified. 
*/ + conn->connhost[0].host = +(char *) malloc(sizeof(char) * (e - conn->pghost + 1)); + if (conn->connhost[0].host == NULL) +goto oom_error; + memcpy(conn->connhost[0].host, conn->pghost, e - conn->pghost); + conn->connhost[0].host[e - conn->pghost] = '\0'; + } + else + { + conn->connhost[0].host = strdup(conn->pghostaddr); + if (conn->connhost[0].host == NULL) +goto oom_error; + } + + conn->connhost[0].hostaddr = strdup(conn->pghostaddr); + if (conn->connhost[0].hostaddr == NULL) goto oom_error; conn->connhost[0].type = CHT_HOST_ADDRESS; } @@ -827,6 +853,10 @@ connectOptions2(PGconn *conn) memcpy(conn->connhost[i].host, s, e - s); conn->connhost[i].host[e - s] = '\0'; + conn->connhost[i].hostaddr = strdup(conn->connhost[i].host); + if (conn->connhost[i].hostaddr == NULL) +goto oom_error; + /* Identify the type of host. */ conn->connhost[i].type = CHT_HOST_NAME; #ifdef HAVE_UNIX_SOCKETS @@ -845,12 +875,14 @@ connectOptions2(PGconn *conn) { #ifdef HAVE_UNIX_SOCKETS conn->connhost[0].host = strdup(DEFAULT_PGSOCKET_DIR); + conn->connhost[0].hostaddr = strdup(DEFAULT_PGSOCKET_DIR); conn->connhost[0].type = CHT_UNIX_SOCKET; #else conn->connhost[0].host = strdup(DefaultHost); + conn->connhost[0].hostaddr = strdup(DefaultHost); conn->connhost[0].type = CHT_HOST_NAME; #endif - if (conn->connhost[0].host == NULL) + if (conn->connhost[0].host == NULL || conn->connhost[0].hostaddr == NULL) goto oom_error; } @@ -1547,7 +1579,7 @@ connectDBStart(PGconn *conn) for (i = 0; i < conn->nconnhost; ++i) { pg_conn_host *ch = >connhost[i]; - char *node = ch->host; + char *node = ch->hostaddr; struct addrinfo hint; int thisport; @@ -3034,6 +3066,8 @@ freePGconn(PGconn *conn) { if (conn->connhost[i].host != NULL) free(conn->connhost[i].host); + if (conn->connhost[i].hostaddr != NULL) +free(conn->connhost[i].hostaddr); if (conn->connhost[i].port != NULL) free(conn->connhost[i].port); if (conn->connhost[i].password != NULL) diff --git a/src/interfaces/libpq/libpq-int.h 
b/src/interfaces/libpq/libpq-int.h index 854ec89..19e3a5e 100644 --- a/src/interfaces/libpq/libpq-int.h +++ b/src/interfaces/libpq/libpq-int.h @@ -306,7 +306,10 @@ typedef enum pg_conn_host_type */ typedef struct pg_conn_host { - char *host; /* host name or address, or socket path */ + char *host; /* host name or address, or socket path, + * used for validating the hostname */ + char *hostaddr; /* host name or address, or socket path, + * used when actually connecting */ pg_conn_host_type type; /* type of host */ char *port; /* port number for this host; if not NULL, * overrrides the PGConn's pgport */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Broken SSL tests in master
Hi, The SSL test suite (src/test/ssl) is broken in master since commit 9a1d0af4ad2cbd419115b453d811c141b80d872b, which is Robert's refactoring of getting the server hostname for GSS, SSPI, and SSL in libpq. The error we get in the test suite: # Running: psql -X -A -t -c SELECT 'connected with user=ssltestuser dbname=trustdb sslcert=invalid hostaddr=127.0.0.1 host=common-name.pg-ssltest.test sslrootcert=ssl/root+server_ca.crt sslmode=verify-full' -d user=ssltestuser dbname=trustdb sslcert=invalid hostaddr=127.0.0.1 host=common-name.pg-ssltest.test sslrootcert=ssl/root+server_ca.crt sslmode=verify-full psql: server certificate for "common-name.pg-ssltest.test" does not match host name "127.0.0.1" As you can see, after the patch libpq will now look at hostaddr rather than host when validating the server certificate, because that is what is stored in the first (and only) entry of conn->connhost, and therefore what PQhost() returns. To me it feels like the proper fix would be to make PQhost() return the value of the host parameter rather than the hostaddr (maybe add a new field in the pg_conn_host struct). But that would be a behaviour change which might break someone's application. Thoughts? Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/24/2016 02:49 PM, Andreas Karlsson wrote: Thanks for finding this. I will look at this more once I get home, but the tests do not fail on my computer. I wonder what I do differently. What versions of Perl and OpenSSL do you run and how did you run the tests when they failed? I ran the tests by running "make check" inside "src/test/ssl". Never mind, I had not fetched the latest master. Once I had done so these tests were broken both in my rebased branch and in the master branch without any modifications. I guess I will have to bisect this once I get home. Info for myself or anyone else who feels like bisecting this: the last known good commit is 67dc4ccbb2e1c27da823eced66d9217a5652cbb0. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/24/2016 08:46 AM, Michael Paquier wrote: On Sat, Nov 12, 2016 at 3:42 AM, Andreas Karlsson <andr...@proxel.se> wrote: On 11/11/2016 07:40 PM, Andreas Karlsson wrote: Here is a new version of the patch with the only differences; 1) The SSL tests have been changed to use reload rather than restart Did you check if the tests pass? I am getting a couple of failures like this one: psql: server certificate for "common-name.pg-ssltest.test" does not match host name "127.0.0.1" not ok 11 - sslrootcert=ssl/root+server_ca.crt sslmode=verify-full Attached are the logs of the run I did, and the same behavior shows for macOS and Linux. The shape of the tests looks correct to me after review. Still, seeing failing tests with sslmode=verify-full is a problem that needs to be addressed. This may be pointing to an incorrect CA load handling, though I could not spot a problem when going through the code. Thanks for finding this. I will look at this more once I get home, but the tests do not fail on my computer. I wonder what I do differently. What versions of Perl and OpenSSL do you run and how did you run the tests when they failed? I ran the tests by running "make check" inside "src/test/ssl". Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Contains and is contained by operators of inet datatypes
On 11/17/2016 11:14 PM, Tom Lane wrote: The original post proposed that we'd eventually get some benefit by being able to repurpose << and >> to mean something else, but the time scale over which that could happen is so long as to make it unlikely to ever happen. I think we'd need to deprecate these names for several years, then actually remove them and have nothing there for a few years more, before we could safely install new operators that take the same arguments but do something different. (For comparison's sake, it took us five years to go from deprecating => as a user operator to starting to use it as parameter naming syntax ... and that was a case where conflicting use could be expected to throw an error, not silently misbehave, so we could force it with little risk of silently breaking peoples' applications. To repurpose << and >> in this way we would need to move much slower.) I agree. The value in re-purposing them is pretty low given the long time scales needed before that can be done. I'm inclined to think we should just reject this patch. I'm certainly not going to commit it without seeing positive votes from multiple people. Given that I reviewed it I think you already have my vote on this. I like the patch because it means less operators to remember for me as a PostgreSQL user. And at least for me inet is a rarely used type compared to hstore, json and range types which all use @> and <@. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Contains and is contained by operators of inet datatypes
On 11/13/2016 01:21 PM, Emre Hasegeli wrote: Thank you for the review. New version is attached. Nice, I am fine with this version of the patch. Setting it to ready for committer! Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Contains and is contained by operators of inet datatypes
Review

- Applies and passes the test suite.
- I think this is a good change since it increases the consistency of the operators. I also like the choice of <<@ and @>> since they feel intuitive to me.
- I tested it and both old and new operators use the brin and gist indexes.
- The new spgist index does not support the old deprecated operators, which is intentional. I do not have a strong opinion here either way, but some people may find this surprising.
- I am not convinced that your changes to the descriptions of the operators necessarily make things clearer. For example "is contained by and smaller network (subnet)" only mentions subnets and not IP addresses.
- Maybe change "deprecated and will eventually be removed." to "deprecated and may be removed in a future release.". I prefer the latter wording but I am fine with either.
- Won't renaming the functions which implement them risk breaking people's applications? While the new names are a bit nicer I am not sure it is worth doing.
- The changes to the code look generally good.

Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_sequence catalog
Review for pg_sequence catalog

I like this change since it moves all the parts which should be transactional to the system catalogs while keeping only the non-transactional stuff in the sequence relations. There was some discussion upthread about more compact representations for the sequences, but I feel that is a separate issue mostly unrelated to this patch. I might play around more with it but it seems to work well so far.

As pointed out by Peter this patch also requires the changes to pg_upgrade. I have not looked at those patches.

= Functional review

- The patch applies and compiles and seems to work fine after some quick manual testing.
- The pg_dump tests fail due to the pg_dump code not being updated. I have attached a patch which fixes this.

= Benchmarks

I was a bit worried that the extra syscache lookups might slow down nextval(), but I got a measurable speedup on a very simple workload which consisted of only calls to nextval() on one sequence. The speedup was about 10% on my machine.

= Code

The changes to the code look generally good.

> @@ -1155,6 +1156,8 @@ doDeletion(const ObjectAddress *object, int flags)
> else
> heap_drop_with_catalog(object->objectId);
> }
> + if (relKind == RELKIND_SEQUENCE)
> + DeleteSequenceTuple(object->objectId);
> break;
> }

I think it might be cleaner here to put this as an "else if", just like "relKind == RELKIND_INDEX".

= Documentation

The patch does not update catalogs.sgml, which it should. 
Andreas diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index ee1f673..a272ad3 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -15115,7 +15115,27 @@ dumpSequence(Archive *fout, TableInfo *tbinfo) snprintf(bufm, sizeof(bufm), INT64_FORMAT, SEQ_MINVALUE); snprintf(bufx, sizeof(bufx), INT64_FORMAT, SEQ_MAXVALUE); - if (fout->remoteVersion >= 80400) + if (fout->remoteVersion >= 10) + { + appendPQExpBuffer(query, + "SELECT relname, " + "seqstart, seqincrement, " + "CASE WHEN seqincrement > 0 AND seqmax = %s THEN NULL " + " WHEN seqincrement < 0 AND seqmax = -1 THEN NULL " + " ELSE seqmax " + "END AS seqmax, " + "CASE WHEN seqincrement > 0 AND seqmin = 1 THEN NULL " + " WHEN seqincrement < 0 AND seqmin = %s THEN NULL " + " ELSE seqmin " + "END AS seqmin, " + "seqcache, seqcycle " + "FROM pg_class c " + "JOIN pg_sequence s ON (s.seqrelid = c.oid) " + "WHERE relname = ", + bufx, bufm); + appendStringLiteralAH(query, tbinfo->dobj.name, fout); + } + else if (fout->remoteVersion >= 80400) { appendPQExpBuffer(query, "SELECT sequence_name, " -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/11/2016 07:40 PM, Andreas Karlsson wrote: Hi, Here is a new version of the patch with the only differences; 1) The SSL tests have been changed to use reload rather than restart 2) Rebased on master And here with a fix to a comment. Andreas diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 787cfce..5e78d81 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -2288,8 +2288,8 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433 The files server.key, server.crt, root.crt, and root.crl (or their configured alternative names) -are only examined during server start; so you must restart -the server for changes in them to take effect. +are examined when reloading the configuration, or when spawning the backend +process on Windows systems. diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 668f217..8449a53 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -77,12 +77,14 @@ static DH *generate_dh_parameters(int prime_len, int generator); static DH *tmp_dh_cb(SSL *s, int is_export, int keylength); static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); -static void initialize_ecdh(void); +static SSL_CTX *initialize_context(void); +static bool initialize_ecdh(SSL_CTX *context); static const char *SSLerrmessage(unsigned long ecode); static char *X509_NAME_to_cstring(X509_NAME *name); static SSL_CTX *SSL_context = NULL; +static bool SSL_initialized = false; /* */ /* Hardcoded values */ @@ -157,14 +159,12 @@ KWbuHn491xNO25CQWMtem80uKw+pTnisBRF/454n1Jnhub144YRBoN8CAQI=\n\ /* * Initialize global SSL context. 
*/ -void +int be_tls_init(void) { - struct stat buf; + SSL_CTX *context; - STACK_OF(X509_NAME) *root_cert_list = NULL; - - if (!SSL_context) + if (!SSL_initialized) { #ifdef HAVE_OPENSSL_INIT_SSL OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL); @@ -173,177 +173,29 @@ be_tls_init(void) SSL_library_init(); SSL_load_error_strings(); #endif - - /* - * We use SSLv23_method() because it can negotiate use of the highest - * mutually supported protocol version, while alternatives like - * TLSv1_2_method() permit only one specific version. Note that we - * don't actually allow SSL v2 or v3, only TLS protocols (see below). - */ - SSL_context = SSL_CTX_new(SSLv23_method()); - if (!SSL_context) - ereport(FATAL, - (errmsg("could not create SSL context: %s", - SSLerrmessage(ERR_get_error(); - - /* - * Disable OpenSSL's moving-write-buffer sanity check, because it - * causes unnecessary failures in nonblocking send cases. - */ - SSL_CTX_set_mode(SSL_context, SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER); - - /* - * Load and verify server's certificate and private key - */ - if (SSL_CTX_use_certificate_chain_file(SSL_context, - ssl_cert_file) != 1) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("could not load server certificate file \"%s\": %s", - ssl_cert_file, SSLerrmessage(ERR_get_error(); - - if (stat(ssl_key_file, ) != 0) - ereport(FATAL, - (errcode_for_file_access(), - errmsg("could not access private key file \"%s\": %m", - ssl_key_file))); - - if (!S_ISREG(buf.st_mode)) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" is not a regular file", - ssl_key_file))); - - /* - * Refuse to load files owned by users other than us or root. - * - * XXX surely we can check this on Windows somehow, too. 
- */ -#if !defined(WIN32) && !defined(__CYGWIN__) - if (buf.st_uid != geteuid() && buf.st_uid != 0) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" must be owned by the database user or root", - ssl_key_file))); -#endif - - /* - * Require no public access to key file. If the file is owned by us, - * require mode 0600 or less. If owned by root, require 0640 or less - * to allow read access through our gid, or a supplementary gid that - * allows to read system-wide certificates. - * - * XXX temporarily suppress check when on Windows, because there may - * not be proper support for Unix-y file permissions. Need to think - * of a reasonable check to apply on Windows. (See also the data - * directory permission check in postmaster.c) - */ -#if !defined(WIN32) && !defined(__CYGWIN__) - if ((buf.st_uid == geteuid() && buf.st_mode & (S_IRWXG | S_IRWXO)) || - (buf.st_uid == 0 && buf.st_mode & (S_IWGRP | S_IXGRP | S_IRWXO))) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("priva
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
Hi, Here is a new version of the patch with the only differences; 1) The SSL tests have been changed to use reload rather than restart 2) Rebased on master Please take a look. Andreas diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 787cfce..5e78d81 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -2288,8 +2288,8 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433 The files server.key, server.crt, root.crt, and root.crl (or their configured alternative names) -are only examined during server start; so you must restart -the server for changes in them to take effect. +are examined when reloading the configuration, or when spawning the backend +process on Windows systems. diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 668f217..8449a53 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -77,12 +77,14 @@ static DH *generate_dh_parameters(int prime_len, int generator); static DH *tmp_dh_cb(SSL *s, int is_export, int keylength); static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); -static void initialize_ecdh(void); +static SSL_CTX *initialize_context(void); +static bool initialize_ecdh(SSL_CTX *context); static const char *SSLerrmessage(unsigned long ecode); static char *X509_NAME_to_cstring(X509_NAME *name); static SSL_CTX *SSL_context = NULL; +static bool SSL_initialized = false; /* */ /* Hardcoded values */ @@ -157,14 +159,12 @@ KWbuHn491xNO25CQWMtem80uKw+pTnisBRF/454n1Jnhub144YRBoN8CAQI=\n\ /* * Initialize global SSL context. 
*/ -void +int be_tls_init(void) { - struct stat buf; + SSL_CTX *context; - STACK_OF(X509_NAME) *root_cert_list = NULL; - - if (!SSL_context) + if (!SSL_initialized) { #ifdef HAVE_OPENSSL_INIT_SSL OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL); @@ -173,177 +173,29 @@ be_tls_init(void) SSL_library_init(); SSL_load_error_strings(); #endif - - /* - * We use SSLv23_method() because it can negotiate use of the highest - * mutually supported protocol version, while alternatives like - * TLSv1_2_method() permit only one specific version. Note that we - * don't actually allow SSL v2 or v3, only TLS protocols (see below). - */ - SSL_context = SSL_CTX_new(SSLv23_method()); - if (!SSL_context) - ereport(FATAL, - (errmsg("could not create SSL context: %s", - SSLerrmessage(ERR_get_error(); - - /* - * Disable OpenSSL's moving-write-buffer sanity check, because it - * causes unnecessary failures in nonblocking send cases. - */ - SSL_CTX_set_mode(SSL_context, SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER); - - /* - * Load and verify server's certificate and private key - */ - if (SSL_CTX_use_certificate_chain_file(SSL_context, - ssl_cert_file) != 1) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("could not load server certificate file \"%s\": %s", - ssl_cert_file, SSLerrmessage(ERR_get_error(); - - if (stat(ssl_key_file, ) != 0) - ereport(FATAL, - (errcode_for_file_access(), - errmsg("could not access private key file \"%s\": %m", - ssl_key_file))); - - if (!S_ISREG(buf.st_mode)) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" is not a regular file", - ssl_key_file))); - - /* - * Refuse to load files owned by users other than us or root. - * - * XXX surely we can check this on Windows somehow, too. 
- */ -#if !defined(WIN32) && !defined(__CYGWIN__) - if (buf.st_uid != geteuid() && buf.st_uid != 0) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" must be owned by the database user or root", - ssl_key_file))); -#endif - - /* - * Require no public access to key file. If the file is owned by us, - * require mode 0600 or less. If owned by root, require 0640 or less - * to allow read access through our gid, or a supplementary gid that - * allows to read system-wide certificates. - * - * XXX temporarily suppress check when on Windows, because there may - * not be proper support for Unix-y file permissions. Need to think - * of a reasonable check to apply on Windows. (See also the data - * directory permission check in postmaster.c) - */ -#if !defined(WIN32) && !defined(__CYGWIN__) - if ((buf.st_uid == geteuid() && buf.st_mode & (S_IRWXG | S_IRWXO)) || - (buf.st_uid == 0 && buf.st_mode & (S_IWGRP | S_IXGRP | S_IRWXO))) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" has group or world access", - ssl_key_file), - errdetail("File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by
Re: [HACKERS] pg_sequence catalog
On 11/10/2016 06:27 AM, Andreas Karlsson wrote: On 11/10/2016 05:29 AM, Peter Eisentraut wrote: On 11/8/16 6:43 PM, Andreas Karlsson wrote: - Shouldn't last_value be NULL directly after we have created the sequence but nobody has called nextval() yet? - I noticed that last_value includes the cached values, but that also seems to me like the correct thing to do. The documentation now emphasizes that this is the value stored on disk. This matches what Oracle does. We also store is_called on disk; I think that if is_called is false then last_value should be NULL. Either that, or we should add is_called to the view. I will give the patch another review some time this week when I can find the time. Other than my comment above about is_called and last_value I think the patch looks great. Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/10/2016 07:16 AM, Michael Paquier wrote: On Wed, Nov 9, 2016 at 7:46 PM, Andreas Karlsson <andr...@proxel.se> wrote: Those tests fail because listen_addresses cannot be changed on reload, so none of the test cases can even connect to the database. When I hacked ServerSetup.pm to set the correct listen_addresses before starting, all tests pass. Hm... listen_addresses remains constant at 127.0.0.1 and setting listen_addresses = '*' does not work either... Perhaps I am missing something? When PostgreSQL is started in the tests it by default only listens on a Unix socket (except on Windows). It is the call to the restart function in the SSL tests which allows PostgreSQL to receive TCP connections. Fixing this in the SSL tests will require some refactoring. It is a bit annoying that if pg_hba.conf contains hostssl then postgres will refuse to start. Maybe this is something we should also fix in this patch, since now that we can enable SSL after starting it becomes more useful to not bail on hostssl. What do you think? I forgot that... There is the same problem today when updating postgresql.conf and restarting the server if there is a hostssl entry. Do you have in mind to relax things? It seems to me that the safest bet is to not reload parameters if ssl is switched from on to off and pg_hba.conf has a hostssl entry, right? That complicates the code though. I personally think that it would be cleaner and easier to understand if we just do not fail on hostssl lines just because SSL is disabled. A warning should be enough. But I do not have any strong opinions here, and would be fine with leaving the code as-is. I will look into writing a cleaner patch for ServerSetup.pm some time later this week. Thanks. Making the restart/reload OS-dependent will be necessary. src/test/ssl can run on Windows. I do not think that this will be an issue with the current design, but I do not have access to a Windows machine for testing.
Andreas
Re: [HACKERS] pg_sequence catalog
On 11/10/2016 05:29 AM, Peter Eisentraut wrote: On 11/8/16 6:43 PM, Andreas Karlsson wrote: - A worry is that it might get a bit confusing to have both the future catalog pg_sequence and the view pg_sequences. We already have this in other cases: pg_index/pg_indexes, pg_user_mapping/pg_user_mappings. It's an established naming system by now. Then I am fine with it. - I think it would be useful to include is_cycled in the view. It's there under the name "cycle". You are right, my bad, I managed to confuse myself. - Shouldn't last_value be NULL directly after we have created the sequence but nobody has called nextval() yet? - I noticed that last_value includes the cached values, but that also seems to me like the correct thing to do. The documentation now emphasizes that this is the value stored on disk. This matches what Oracle does. We also store is_called on disk; I think that if is_called is false then last_value should be NULL. Either that, or we should add is_called to the view. I will give the patch another review some time this week when I can find the time. Andreas
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/09/2016 06:54 AM, Michael Paquier wrote: It seems to me that this patch is missing something... To begin with, src/test/ssl/ServerSetup.pm should be patched so that the new SSL configuration is reloaded after pg_ctl reload, and not after an instance restart. That's straightforward: --- a/src/test/ssl/ServerSetup.pm +++ b/src/test/ssl/ServerSetup.pm @@ -96,7 +96,7 @@ sub configure_test_server_for_ssl close HBA; } -# Change the configuration to use given server cert file, and restart +# Change the configuration to use given server cert file, and reload # the server so that the configuration takes effect. sub switch_server_cert { @@ -115,6 +115,6 @@ sub switch_server_cert print SSLCONF "ssl_crl_file='root+client.crl'\n"; close SSLCONF; - # Stop and restart server to reload the new config. - $node->restart; + # Reload the new configuration set. + $node->reload; } Once I did that, half of the tests are failing. And I would have expected all of them to work properly. Those tests fail because listen_addresses cannot be changed on reload, so none of the test cases can even connect to the database. When I hacked ServerSetup.pm to set the correct listen_addresses before starting, all tests pass. It is a bit annoying that if pg_hba.conf contains hostssl then postgres will refuse to start. Maybe this is something we should also fix in this patch, since now that we can enable SSL after starting it becomes more useful to not bail on hostssl. What do you think? I will look into writing a cleaner patch for ServerSetup.pm some time later this week. Andreas
Re: [HACKERS] pg_sequence catalog
Review of the pg_sequences view. This seems like a useful addition to me, making life easier for administrators and monitoring tools. While there is already a view in information_schema, it is missing cache_size and last_value. = Functional review - The patch applies and passes the test suite without any issue. - A worry is that it might get a bit confusing to have both the future catalog pg_sequence and the view pg_sequences. - I think it would be useful to include is_cycled in the view. - When creating a temporary sequence and then running "SELECT * FROM pg_sequences" in another session I get the following error. ERROR: cannot access temporary tables of other sessions - Shouldn't last_value be NULL directly after we have created the sequence but nobody has called nextval() yet? - I noticed that last_value includes the cached values, but that also seems to me like the correct thing to do. - I do not like the name of the new function, lastval(regclass). I think, like you suggested, it would be better with something more verbose. sequence_lastval()? sequence_last_value()? = Code - There is an XXX comment still in the code. It is about the name of the lastval1() function. = Documentation - The documentation does not mention the last_value column. - The extra empty line after "" does not fit with the formatting of the rest of the SGML file. Andreas
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 11/08/2016 01:22 PM, Michael Banck wrote: Thanks! I couldn't find further faults in my testing. I guess the question of what to do about this on Windows is possibly still open, but as I am not familiar with the Windows port at all I've marked it Ready for Committer for now. Thanks again for the review! Andreas
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
Thanks for the review! I have fixed all your feedback in the attached patch. On 11/03/2016 04:24 PM, Michael Banck wrote: It does not update the documentation; I think at least section 18.9 "Secure TCP/IP Connections with SSL" needs updating as it says: "The files server.key, server.crt, root.crt, and root.crl (or their configured alternative names) are only examined during server start; so you must restart the server for changes in them to take effect". Changed this. However, see below for segfaults during testing. Fixed; this was caused by me not setting SSL_context to NULL after freeing it. [...] I am not sure this '!!' operator is according to project policy; it seems to not be used elsewhere in the codebase except for imported or auto-generated code. At least there should be a comment here, methinks. I changed the code to compare with '\0' instead. [...] Missing semicolon at the end of the line. [...] Opening braces should be on the next line. [...] Same. Fixed. [...] All the declarations above this one are global variables for GUCs (see postmaster.h, lines 16-31). So ISTM this static variable declaration should be introduced by a comment in order to logically separate it from the previous ones, like the following static variables are: Fixed. [...] This hunk baffled me at first because two lines below your added secure_destroy() declaration, the same line is already present. I did some digging and it turns out we had a secure_destroy() in the ancient past, but its implementation got removed in 2008 in 4e8162865 as there were no (more) users of it, however, the declaration was kept on until now. So this hunk should be removed I guess. Removed.
Andreas diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 787cfce..5e78d81 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -2288,8 +2288,8 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433 The files server.key, server.crt, root.crt, and root.crl (or their configured alternative names) -are only examined during server start; so you must restart -the server for changes in them to take effect. +are examined when reloading the configuration, or when spawning the backend +process on Windows systems. diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 668f217..8449a53 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -77,12 +77,14 @@ static DH *generate_dh_parameters(int prime_len, int generator); static DH *tmp_dh_cb(SSL *s, int is_export, int keylength); static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); -static void initialize_ecdh(void); +static SSL_CTX *initialize_context(void); +static bool initialize_ecdh(SSL_CTX *context); static const char *SSLerrmessage(unsigned long ecode); static char *X509_NAME_to_cstring(X509_NAME *name); static SSL_CTX *SSL_context = NULL; +static bool SSL_initialized = false; /* */ /* Hardcoded values */ @@ -157,14 +159,12 @@ KWbuHn491xNO25CQWMtem80uKw+pTnisBRF/454n1Jnhub144YRBoN8CAQI=\n\ /* * Initialize global SSL context. 
*/ -void +int be_tls_init(void) { - struct stat buf; + SSL_CTX *context; - STACK_OF(X509_NAME) *root_cert_list = NULL; - - if (!SSL_context) + if (!SSL_initialized) { #ifdef HAVE_OPENSSL_INIT_SSL OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL); @@ -173,177 +173,29 @@ be_tls_init(void) SSL_library_init(); SSL_load_error_strings(); #endif - - /* - * We use SSLv23_method() because it can negotiate use of the highest - * mutually supported protocol version, while alternatives like - * TLSv1_2_method() permit only one specific version. Note that we - * don't actually allow SSL v2 or v3, only TLS protocols (see below). - */ - SSL_context = SSL_CTX_new(SSLv23_method()); - if (!SSL_context) - ereport(FATAL, - (errmsg("could not create SSL context: %s", - SSLerrmessage(ERR_get_error(); - - /* - * Disable OpenSSL's moving-write-buffer sanity check, because it - * causes unnecessary failures in nonblocking send cases. - */ - SSL_CTX_set_mode(SSL_context, SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER); - - /* - * Load and verify server's certificate and private key - */ - if (SSL_CTX_use_certificate_chain_file(SSL_context, - ssl_cert_file) != 1) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("could not load server certificate file \"%s\": %s", - ssl_cert_file, SSLerrmessage(ERR_get_error(); - - if (stat(ssl_key_file, ) != 0) - ereport(FATAL, - (errcode_for_file_access(), - errmsg("could not access private key file \"%s\": %m", - ssl_key_file))); - - if (!S_ISREG(buf.st_mode)) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), -
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
I have attached a version of the patch rebased on top of the OpenSSL 1.1 changes. Andreas diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c index 668f217..a1b582f 100644 --- a/src/backend/libpq/be-secure-openssl.c +++ b/src/backend/libpq/be-secure-openssl.c @@ -77,12 +77,14 @@ static DH *generate_dh_parameters(int prime_len, int generator); static DH *tmp_dh_cb(SSL *s, int is_export, int keylength); static int verify_cb(int, X509_STORE_CTX *); static void info_cb(const SSL *ssl, int type, int args); -static void initialize_ecdh(void); +static SSL_CTX *initialize_context(void); +static bool initialize_ecdh(SSL_CTX *context); static const char *SSLerrmessage(unsigned long ecode); static char *X509_NAME_to_cstring(X509_NAME *name); static SSL_CTX *SSL_context = NULL; +static bool SSL_initialized = false; /* */ /* Hardcoded values */ @@ -157,14 +159,12 @@ KWbuHn491xNO25CQWMtem80uKw+pTnisBRF/454n1Jnhub144YRBoN8CAQI=\n\ /* * Initialize global SSL context. */ -void +int be_tls_init(void) { - struct stat buf; + SSL_CTX *context; - STACK_OF(X509_NAME) *root_cert_list = NULL; - - if (!SSL_context) + if (!SSL_initialized) { #ifdef HAVE_OPENSSL_INIT_SSL OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL); @@ -173,177 +173,28 @@ be_tls_init(void) SSL_library_init(); SSL_load_error_strings(); #endif - - /* - * We use SSLv23_method() because it can negotiate use of the highest - * mutually supported protocol version, while alternatives like - * TLSv1_2_method() permit only one specific version. Note that we - * don't actually allow SSL v2 or v3, only TLS protocols (see below). - */ - SSL_context = SSL_CTX_new(SSLv23_method()); - if (!SSL_context) - ereport(FATAL, - (errmsg("could not create SSL context: %s", - SSLerrmessage(ERR_get_error(); - - /* - * Disable OpenSSL's moving-write-buffer sanity check, because it - * causes unnecessary failures in nonblocking send cases. 
- */ - SSL_CTX_set_mode(SSL_context, SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER); - - /* - * Load and verify server's certificate and private key - */ - if (SSL_CTX_use_certificate_chain_file(SSL_context, - ssl_cert_file) != 1) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("could not load server certificate file \"%s\": %s", - ssl_cert_file, SSLerrmessage(ERR_get_error(); - - if (stat(ssl_key_file, ) != 0) - ereport(FATAL, - (errcode_for_file_access(), - errmsg("could not access private key file \"%s\": %m", - ssl_key_file))); - - if (!S_ISREG(buf.st_mode)) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" is not a regular file", - ssl_key_file))); - - /* - * Refuse to load files owned by users other than us or root. - * - * XXX surely we can check this on Windows somehow, too. - */ -#if !defined(WIN32) && !defined(__CYGWIN__) - if (buf.st_uid != geteuid() && buf.st_uid != 0) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" must be owned by the database user or root", - ssl_key_file))); -#endif - - /* - * Require no public access to key file. If the file is owned by us, - * require mode 0600 or less. If owned by root, require 0640 or less - * to allow read access through our gid, or a supplementary gid that - * allows to read system-wide certificates. - * - * XXX temporarily suppress check when on Windows, because there may - * not be proper support for Unix-y file permissions. Need to think - * of a reasonable check to apply on Windows. 
(See also the data - * directory permission check in postmaster.c) - */ -#if !defined(WIN32) && !defined(__CYGWIN__) - if ((buf.st_uid == geteuid() && buf.st_mode & (S_IRWXG | S_IRWXO)) || - (buf.st_uid == 0 && buf.st_mode & (S_IWGRP | S_IXGRP | S_IRWXO))) - ereport(FATAL, - (errcode(ERRCODE_CONFIG_FILE_ERROR), - errmsg("private key file \"%s\" has group or world access", - ssl_key_file), - errdetail("File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root."))); -#endif - - if (SSL_CTX_use_PrivateKey_file(SSL_context, - ssl_key_file, - SSL_FILETYPE_PEM) != 1) - ereport(FATAL, - (errmsg("could not load private key file \"%s\": %s", - ssl_key_file, SSLerrmessage(ERR_get_error(); - - if (SSL_CTX_check_private_key(SSL_context) != 1) - ereport(FATAL, - (errmsg("check of private key failed: %s", - SSLerrmessage(ERR_get_error(); - } - - /* set up ephemeral DH keys, and disallow SSL v2/v3 while at it */ - SSL_CTX_set_tmp_dh_callback(SSL_context, tmp_dh_cb); - SSL_CTX_set_options(SSL_context, - SSL_OP_SINGLE_DH_USE | - SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3); - - /* set
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 09/16/2016 04:11 PM, Christoph Berg wrote: Thanks for the patch! I just tried to apply it to 9.2. There was a conflict in configure.in which was trivial to resolve. Another conflict in contrib/pgcrypto/pgcrypto.c was not applicable because the code doesn't seem to exist (didn't try very hard though). Ignoring the contrib conflict, it still didn't compile: /home/cbe/projects/postgresql/debian/9.2/build/../src/backend/libpq/be-secure.c: In function ‘secure_write’: /home/cbe/projects/postgresql/debian/9.2/build/../src/backend/libpq/be-secure.c:342:17: error: dereferencing pointer to incomplete type ‘SSL {aka struct ssl_st}’ if (port->ssl->state != SSL_ST_OK) ^~ /home/cbe/projects/postgresql/debian/9.2/build/../src/backend/libpq/be-secure.c:342:28: error: ‘SSL_ST_OK’ undeclared (first use in this function) if (port->ssl->state != SSL_ST_OK) ^ This is related to the renegotiation code, which was first fixed and later removed in the 9.4 cycle, but intentionally not backported. It seems like OpenSSL refactored the state machine in 1.1, which is why the code above breaks. I am not entirely sure I follow what the old code in 9.3 and 9.2 is trying to do and why it messes directly with the state of the state machine. Andreas
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 09/15/2016 05:38 PM, Alvaro Herrera wrote: I suppose some interested party could grab the patch that Heikki committed to the new branches and produce a back-patch that can be applied to the older branches. Here is the result of backporting the sum of the two patches on top of REL9_4_STABLE. Not sure if we need this, but if we do we can apply this patch. Andreas diff --git a/configure b/configure index 6c6c08d..9470ed1 100755 --- a/configure +++ b/configure @@ -8621,9 +8621,9 @@ else as_fn_error $? "library 'crypto' is required for OpenSSL" "$LINENO" 5 fi - { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_library_init in -lssl" >&5 -$as_echo_n "checking for SSL_library_init in -lssl... " >&6; } -if ${ac_cv_lib_ssl_SSL_library_init+:} false; then : + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_new in -lssl" >&5 +$as_echo_n "checking for SSL_new in -lssl... " >&6; } +if ${ac_cv_lib_ssl_SSL_new+:} false; then : $as_echo_n "(cached) " >&6 else ac_check_lib_save_LIBS=$LIBS @@ -8637,27 +8637,27 @@ cat confdefs.h - <<_ACEOF >conftest.$ac_ext #ifdef __cplusplus extern "C" #endif -char SSL_library_init (); +char SSL_new (); int main () { -return SSL_library_init (); +return SSL_new (); ; return 0; } _ACEOF if ac_fn_c_try_link "$LINENO"; then : - ac_cv_lib_ssl_SSL_library_init=yes + ac_cv_lib_ssl_SSL_new=yes else - ac_cv_lib_ssl_SSL_library_init=no + ac_cv_lib_ssl_SSL_new=no fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_ssl_SSL_library_init" >&5 -$as_echo "$ac_cv_lib_ssl_SSL_library_init" >&6; } -if test "x$ac_cv_lib_ssl_SSL_library_init" = xyes; then : +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_ssl_SSL_new" >&5 +$as_echo "$ac_cv_lib_ssl_SSL_new" >&6; } +if test "x$ac_cv_lib_ssl_SSL_new" = xyes; then : cat >>confdefs.h <<_ACEOF #define HAVE_LIBSSL 1 _ACEOF @@ -8727,9 +8727,9 @@ else as_fn_error $? 
"library 'eay32' or 'crypto' is required for OpenSSL" "$LINENO" 5 fi - { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing SSL_library_init" >&5 -$as_echo_n "checking for library containing SSL_library_init... " >&6; } -if ${ac_cv_search_SSL_library_init+:} false; then : + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing SSL_new" >&5 +$as_echo_n "checking for library containing SSL_new... " >&6; } +if ${ac_cv_search_SSL_new+:} false; then : $as_echo_n "(cached) " >&6 else ac_func_search_save_LIBS=$LIBS @@ -8742,11 +8742,11 @@ cat confdefs.h - <<_ACEOF >conftest.$ac_ext #ifdef __cplusplus extern "C" #endif -char SSL_library_init (); +char SSL_new (); int main () { -return SSL_library_init (); +return SSL_new (); ; return 0; } @@ -8759,25 +8759,25 @@ for ac_lib in '' ssleay32 ssl; do LIBS="-l$ac_lib $ac_func_search_save_LIBS" fi if ac_fn_c_try_link "$LINENO"; then : - ac_cv_search_SSL_library_init=$ac_res + ac_cv_search_SSL_new=$ac_res fi rm -f core conftest.err conftest.$ac_objext \ conftest$ac_exeext - if ${ac_cv_search_SSL_library_init+:} false; then : + if ${ac_cv_search_SSL_new+:} false; then : break fi done -if ${ac_cv_search_SSL_library_init+:} false; then : +if ${ac_cv_search_SSL_new+:} false; then : else - ac_cv_search_SSL_library_init=no + ac_cv_search_SSL_new=no fi rm conftest.$ac_ext LIBS=$ac_func_search_save_LIBS fi -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_SSL_library_init" >&5 -$as_echo "$ac_cv_search_SSL_library_init" >&6; } -ac_res=$ac_cv_search_SSL_library_init +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_SSL_new" >&5 +$as_echo "$ac_cv_search_SSL_new" >&6; } +ac_res=$ac_cv_search_SSL_new if test "$ac_res" != no; then : test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" @@ -8797,6 +8797,37 @@ _ACEOF fi done + # Functions introduced in OpenSSL 1.1.0. 
We used to check for + # OPENSSL_VERSION_NUMBER, but that didn't work with 1.1.0, because LibreSSL + # defines OPENSSL_VERSION_NUMBER to claim version 2.0.0, even though it + # doesn't have these OpenSSL 1.1.0 functions. So check for individual + # functions. + for ac_func in OPENSSL_init_ssl BIO_get_data BIO_meth_new ASN1_STRING_get0_data RAND_OpenSSL +do : + as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh` +ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var" +if eval test \"x\$"$as_ac_var"\" = x"yes"; then : + cat >>confdefs.h <<_ACEOF +#define `$as_echo "HAVE_$ac_func" | $as_tr_cpp` 1 +_ACEOF + +fi +done + + # OpenSSL versions before 1.1.0 required setting callback functions, for + # thread-safety. In 1.1.0, it's no longer required, and CRYPTO_lock() + # function was removed. + for ac_func in CRYPTO_lock +do : + ac_fn_c_check_func "$LINENO" "CRYPTO_lock" "ac_cv_func_CRYPTO_lock" +if test "x$ac_cv_func_CRYPTO_lock" = xyes; then : + cat
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 09/15/2016 02:03 AM, Andreas Karlsson wrote: On 09/12/2016 06:51 PM, Heikki Linnakangas wrote: Changes since last version: * Added more error checks to the my_BIO_s_socket() function. Check for NULL result from malloc(). Check the return code of BIO_meth_set_*() functions; looking at OpenSSL sources, they always succeed, but all the test/example programs that come with OpenSSL do check them. * Use BIO_get_new_index() to get the index number for the wrapper BIO. * Also call BIO_meth_set_puts(). It was missing in previous patch versions. * Fixed src/test/ssl test suite to also work with OpenSSL 1.1.0. * Changed all references (in existing code) to SSLEAY_VERSION_NUMBER into OPENSSL_VERSION_NUMBER, for consistency. * Squashed all into one patch. I intend to apply this to all supported branches, so please have a look! This is now against REL9_6_STABLE, but there should be little difference between branches in the code that this touches. This patch no longer seems to apply to head now that support for 0.9.6 has been removed. Is that intentional? Never mind. I just failed at reading. Now for a review: It looks generally good, but I think I saw one error. In fe-secure-openssl.c your code still calls SSL_library_init() when building against OpenSSL 1.1. I think it should be enough to just call OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL) like you do in be-secure. Andreas