Re: [HACKERS] WAL consistency check facility
On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggswrote: > > I think you should add this as part of the default testing for both > check and installcheck. I can't imagine why we'd have it and not use > it during testing. > The actual consistency checks are done during redo (replay), so not sure whats in you mind for enabling it with check or installcheck. I think we can run few recovery/replay tests with this framework. Also running the tests under this framework could be time-consuming as it logs the entire page for each WAL record we write and then during replay reads the same. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 08/26/2016 07:04 PM, Heikki Linnakangas wrote: On 08/26/2016 07:44 PM, Tom Lane wrote: Peter Eisentrautwrites: On 8/26/16 5:31 AM, Heikki Linnakangas wrote: I think now would be a good time to drop support for OpenSSL versions older than 0.9.8. OpenSSL don't even support 0.9.8 anymore, although there are probably distributions out there that still provide patches for it. But OpenSSL 0.9.7 and older are really not interesting for PostgreSQL 10 anymore, I think. CentOS 5 currently ships 0.9.8e. That's usually the oldest OS we want to support eagerly. Also, I get this on fully-up-to-date OS X (El Capitan): $ openssl version OpenSSL 0.9.8zh 14 Jan 2016 Ok, sold, let's remove support for OpenSSL < 0.9.8. I have attached a patch which removes the < 0.9.8 compatibility code. Should we also add a version check to configure? We do not have any such check currently. Andreas diff --git a/contrib/pgcrypto/openssl.c b/contrib/pgcrypto/openssl.c index 976af70..ffab5d2 100644 --- a/contrib/pgcrypto/openssl.c +++ b/contrib/pgcrypto/openssl.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -47,155 +48,6 @@ #define MAX_IV (128/8) /* - * Compatibility with OpenSSL 0.9.6 - * - * It needs AES and newer DES and digest API. - */ -#if OPENSSL_VERSION_NUMBER >= 0x00907000L - -/* - * Nothing needed for OpenSSL 0.9.7+ - */ - -#include -#else /* old OPENSSL */ - -/* - * Emulate OpenSSL AES. - */ - -#include "rijndael.c" - -#define AES_ENCRYPT 1 -#define AES_DECRYPT 0 -#define AES_KEY rijndael_ctx - -static int -AES_set_encrypt_key(const uint8 *key, int kbits, AES_KEY *ctx) -{ - aes_set_key(ctx, key, kbits, 1); - return 0; -} - -static int -AES_set_decrypt_key(const uint8 *key, int kbits, AES_KEY *ctx) -{ - aes_set_key(ctx, key, kbits, 0); - return 0; -} - -static void -AES_ecb_encrypt(const uint8 *src, uint8 *dst, AES_KEY *ctx, int enc) -{ - memcpy(dst, src, 16); - if (enc) - aes_ecb_encrypt(ctx, dst, 16); - else - aes_ecb_decrypt(ctx, dst, 16); -} - -static void -AES_cbc_encrypt(const uint8 *src, uint8 *dst, int len, AES_KEY *ctx, uint8 *iv, int enc) -{ - memcpy(dst, src, len); - if (enc) - { - aes_cbc_encrypt(ctx, iv, dst, len); - memcpy(iv, dst + len - 16, 16); - } - else - { - aes_cbc_decrypt(ctx, iv, dst, len); - memcpy(iv, src + len - 16, 16); - } -} - -/* - * Emulate DES_* API - */ - -#define DES_key_schedule des_key_schedule -#define DES_cblock des_cblock -#define DES_set_key(k, ks) \ - des_set_key((k), *(ks)) -#define DES_ecb_encrypt(i, o, k, e) \ - des_ecb_encrypt((i), (o), *(k), (e)) -#define DES_ncbc_encrypt(i, o, l, k, iv, e) \ - des_ncbc_encrypt((i), (o), (l), *(k), (iv), (e)) -#define DES_ecb3_encrypt(i, o, k1, k2, k3, e) \ - des_ecb3_encrypt((des_cblock *)(i), (des_cblock *)(o), \ -*(k1), *(k2), *(k3), (e)) -#define DES_ede3_cbc_encrypt(i, o, l, k1, k2, k3, iv, e) \ - des_ede3_cbc_encrypt((i), (o), \ -(l), *(k1), *(k2), *(k3), (iv), (e)) - -/* - * Emulate newer digest API. - */ - -static void -EVP_MD_CTX_init(EVP_MD_CTX *ctx) -{ - memset(ctx, 0, sizeof(*ctx)); -} - -static int -EVP_MD_CTX_cleanup(EVP_MD_CTX *ctx) -{ - px_memset(ctx, 0, sizeof(*ctx)); - return 1; -} - -static int -EVP_DigestInit_ex(EVP_MD_CTX *ctx, const EVP_MD *md, void *engine) -{ - EVP_DigestInit(ctx, md); - return 1; -} - -static int -EVP_DigestFinal_ex(EVP_MD_CTX *ctx, unsigned char *res, unsigned int *len) -{ - EVP_DigestFinal(ctx, res, len); - return 1; -} -#endif /* old OpenSSL */ - -/* - * Provide SHA2 for older OpenSSL < 0.9.8 - */ -#if OPENSSL_VERSION_NUMBER < 0x00908000L - -#include "sha2.c" -#include "internal-sha2.c" - -typedef void (*init_f) (PX_MD *md); - -static int -compat_find_digest(const char *name, PX_MD **res) -{ - init_f init = NULL; - - if (pg_strcasecmp(name, "sha224") == 0) - init = init_sha224; - else if (pg_strcasecmp(name, "sha256") == 0) - init = init_sha256; - else if (pg_strcasecmp(name, "sha384") == 0) - init = init_sha384; - else if (pg_strcasecmp(name, "sha512") == 0) - init = init_sha512; - else - return PXE_NO_HASH; - - *res = px_alloc(sizeof(PX_MD)); - init(*res); - return 0; -} -#else -#define compat_find_digest(name, res) (PXE_NO_HASH) -#endif - -/* * Hashes */ @@ -275,7 +127,7 @@ px_find_digest(const char *name, PX_MD **res) md = EVP_get_digestbyname(name); if (md == NULL) - return compat_find_digest(name, res); + return PXE_NO_HASH; digest = px_alloc(sizeof(*digest)); digest->algo = md; diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index a996875..2420387 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -2827,30 +2827,6 @@ MANPATH=/usr/lib/scohelp/%L/man:/usr/dt/man:/usr/man:/usr/share/man:scohelp:/usr -Problems with OpenSSL - - - When you build PostgreSQL with OpenSSL support you might get - compilation errors in the following files: - -
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
2016-08-26 19:37 GMT-03:00 Tom Lane: > =?UTF-8?B?TWFydMOtbiBNYXJxdcOpcw==?= writes: >> Looking at this issue today, I found that we are not setting a >> dependency for an index created inside an extension. > > Surely the index has a dependency on a table, which depends on the > extension? > > If you mean that you want an extension to create an index on a table that > doesn't belong to it, but it's assuming pre-exists, I think that's just > stupid and we need not support it. Well, there's still the second pattern I mentioned before (which actually came up while trying to fix this patch). Extension creates a table and an index over one of the columns: CREATE TABLE regress_pg_dump_schema.test_table ( col1 int, col2 int ); CREATE INDEX test_extension_index ON regress_pg_dump_schema.test_table (col2); Later, some application (or a user, doesn't matter really) creates a second index over col1: CREATE INDEX test_index ON regress_pg_dump_schema.test_table (col1); What we are doing (or at least it's what I understand from the code) is checking if the table depends on an extension, and so we don't dump it. We should be able to use the same procedure (and reuse the code we already have) to decide if an index should be dumped or not. But we are missing the dependency, and so it's not possible to know that regress_pg_dump_schema.test_extension_index depends on the extension and regress_pg_dump_schema.test_index doesn't. Or is this something we shouldn't support (in that case we should document it). -- Martín Marquéshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
On 08/26/2016 04:15 PM, Tomas Vondra wrote: On 08/27/2016 12:37 AM, Tom Lane wrote: =?UTF-8?B?TWFydMOtbiBNYXJxdcOpcw==?=writes: Looking at this issue today, I found that we are not setting a dependency for an index created inside an extension. Surely the index has a dependency on a table, which depends on the extension? If you mean that you want an extension to create an index on a table that doesn't belong to it, but it's assuming pre-exists, I think that's just stupid and we need not support it. I don't see why that would be stupid (and I'm not sure it's up to us to just decide it's stupid). +1, there are a lot of workflow patterns that make sense and don't make sense depending on your perspective, consider safe mode with MySQL. I personally think it is stupid too but it also obviously has a huge useful user base. Imagine you use extensions to break the application into modules. You have a "basic" extension, with the table, and a "search" extension implementing fulltext search on top of that table. Isn't it natural to keep the gin indexes in the search extension? So the basic--1.0.sql will do something like CREATE TABLE x ( ...) and search--1.0.sql would do CREATE INDEX y ON x USING gin (z); CREATE FUNCTION search_objects(..) ... because the index and function doing the search are somewhat tightly coupled. I admit this is just an example and I don't know how many people use extensions this way, but I don't see why this would be inherently stupid pattern? It isn't and in fact shows that as we extend Pg, the more we allow people flexibility in developing WITHIN the database, the more we will be able to influence them to do so. That is a good thing. Or maybe we should just tell everyone, use an ORM it will solve all your problems. (/sarcasm) Sincerely, JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
Tomas Vondrawrites: > On 08/27/2016 12:37 AM, Tom Lane wrote: >> If you mean that you want an extension to create an index on a table >> that doesn't belong to it, but it's assuming pre-exists, I think >> that's just stupid and we need not support it. > I don't see why that would be stupid (and I'm not sure it's up to us to > just decide it's stupid). Well, think about it. 1. Let's say user A owns the pre-existing table and user B owns the extension. Who owns the index? 2. Generally speaking, if an object is part of an extension then you can't drop the object without dropping the whole extension. This means that either user A can't drop his own table, or he can but that causes dropping the whole extension (which he does not own), not to mention whatever else depends on it (which he owns even less). 3. Can user A do something that would mutate the index? (For instance, ALTER COLUMN TYPE on one of the columns it's on.) Now is it still a member of the extension? If user A can't, can user B? What if either of them mutated the column to a datatype that belongs to some extension that has a dependency on the original one? Now you've got a circular dependency loop, what fun --- especially at dump/reload time, where pg_dump has no hope of breaking the circularity. 4. If we're going to support indexes as independent members of extensions, why not foreign-key constraints, or check constraints? Maybe check constraints on domain types that don't belong to the extension? Heck, why not allow an extension to do ALTER TABLE ADD COLUMN? (I think BTW that several of these cases would allow creation of circular extension dependencies even without any ALTER COLUMN TYPE shenanigans.) None of this makes any sense, because these things are not stand-alone objects in any sense of the word. They are attributes of tables (or domain types, for that case) and there are good reasons why we don't consider such attributes to have ownership independent of the object they are part of. I do not think it is sensible for an extension to "own" objects that don't also have, or could potentially have [1], independent ownership in the sense of an owning user. > Imagine you use extensions to break the application into modules. I do not think that extensions as we currently understand them are a suitable basis for slicing an application in the way you suggest. I'm fine with an extension owning a whole table, but the rest of this is just crazy. And I sure as hell do not want to put it in as part of a bug fix for existing releases. regards, tom lane [1] weasel wording for cases like casts, which we don't consider to have owners, though I'm not sure why not. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
On 08/27/2016 12:37 AM, Tom Lane wrote: =?UTF-8?B?TWFydMOtbiBNYXJxdcOpcw==?=writes: Looking at this issue today, I found that we are not setting a dependency for an index created inside an extension. Surely the index has a dependency on a table, which depends on the extension? If you mean that you want an extension to create an index on a table that doesn't belong to it, but it's assuming pre-exists, I think that's just stupid and we need not support it. I don't see why that would be stupid (and I'm not sure it's up to us to just decide it's stupid). Imagine you use extensions to break the application into modules. You have a "basic" extension, with the table, and a "search" extension implementing fulltext search on top of that table. Isn't it natural to keep the gin indexes in the search extension? So the basic--1.0.sql will do something like CREATE TABLE x ( ...) and search--1.0.sql would do CREATE INDEX y ON x USING gin (z); CREATE FUNCTION search_objects(..) ... because the index and function doing the search are somewhat tightly coupled. I admit this is just an example and I don't know how many people use extensions this way, but I don't see why this would be inherently stupid pattern? I see the current behavior is documented, and I do understand why global objects can't be part of the extension, but for indexes it seems to violate POLA a bit. Is there a reason why we don't want the extension/index dependencies? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
=?UTF-8?B?TWFydMOtbiBNYXJxdcOpcw==?=writes: > Looking at this issue today, I found that we are not setting a > dependency for an index created inside an extension. Surely the index has a dependency on a table, which depends on the extension? If you mean that you want an extension to create an index on a table that doesn't belong to it, but it's assuming pre-exists, I think that's just stupid and we need not support it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
Hi, 2016-08-26 10:53 GMT-03:00 Martín Marqués: > > There's still one issue, which I'll add a test for as well, which is > that if the index was created by the extension, it will be dumped > anyway. I'll have a look at that as well. Looking at this issue today, I found that we are not setting a dependency for an index created inside an extension. I don't know if it's deliberate or an overlook. The thing is that we can't check pg_depend for dependency of an index and the extension that creates it. I was talking with other developers, and we kind of agree this is a bug, for 2 reasons we thought of: *) If the extension only creates an index over an existing table, a drop extension will not drop that index *) We need to have the dependency for this patch as well, else we'll end up with an inconsistent dump, or at least one that could restore with a != 0 return error code. I'll open a separate bug report for this. Regards, -- Martín Marquéshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 8/26/16 12:26 PM, Euler Taveira wrote: > initdb: we already have 'pg_ctl init' (since 9.0) and could remove initdb. I have a concern specifically about pg_ctl. Depending on how your PostgreSQL is packaged, not all of the pg_ctl actions are safe or sensible to run. For example, if you are using systemd, then using pg_ctl restart will probably not do the right thing. And depending on SELinux (maybe), running initdb directly might also not do the right thing. In most cases, you can just not use pg_ctl and do all these things differently, but for example pg_ctl promote is the only documented way to do that thing. Until we have a better way to figure that out, I wouldn't want to put more things into pg_ctl, especially as the only way. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 2016-08-26 17:31:14 -0400, Peter Eisentraut wrote: > I agree with all that. But the subject line is specifically about > moving pg_xlog. So if your opinion is that we shouldn't move pg_xlog, > then that is noted. But if we were to move it, we can think about a > good place to move it to. I think it's probably worth moving pg_xlog, because the benefit also includes preventing a few users from shooting themselves somewhere vital. That's imo much less the case for some of the other moves. But I still don't think think a largescale reorganization is a good idea, it'll just stall and nothing will happen. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 8/26/16 5:20 PM, Andres Freund wrote: > I do think there's an order of magnitude between the impact between > moving some and moving everything. And that's going to impact > cost/benefit calculations. > > Moving e.g. all ephemeral files into a (possibly configurable) directory > is going to hardly impact anyone. Renaming pg_logical into something > different (FWIW, it was originally named differently...) will hopefully > impact nobody, excepting some out of date file exclusion lists possibly. > > But moving config files, and even pg_xlog (which we document to be > symlinkable somewhere else) imo is different. I agree with all that. But the subject line is specifically about moving pg_xlog. So if your opinion is that we shouldn't move pg_xlog, then that is noted. But if we were to move it, we can think about a good place to move it to. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 2016-08-26 17:11:00 -0400, Peter Eisentraut wrote: > On 8/26/16 12:28 PM, Tom Lane wrote: > > Also, I'd just as soon not move/rename things > > that don't really need it. > > I'm just as happy with not changing anything. But if we're going to > rename stuff, let's at least think about something slightly more > comprehensive. Any rename is going to break a bunch of stuff. But if > we break it in a way that reduces the need for future discussion or > changes, it would at least be a small win in the long run. I do think there's an order of magnitude between the impact between moving some and moving everything. And that's going to impact cost/benefit calculations. Moving e.g. all ephemeral files into a (possibly configurable) directory is going to hardly impact anyone. Renaming pg_logical into something different (FWIW, it was originally named differently...) will hopefully impact nobody, excepting some out of date file exclusion lists possibly. But moving config files, and even pg_xlog (which we document to be symlinkable somewhere else) imo is different. The other thing is that the likelihood of getting anywhere by doing radical one-off redesigns is approximately 0. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 2016-08-26 13:26:39 -0300, Euler Taveira wrote: > I'm bringing this $subject into discussion again. Historically, we are > carrying binary names that have been confused newbies. createuser is the > worst name so for. Also, names like createdb, initdb, reindexdb, and > droplang does not suggest what product it is referring to. Adding a > prefix (pg_, pg, ...) would 'make things clear'. If we have a consensus > about this change, I suggest renaming the following binaries: > > clusterdb > createdb > createlang > createuser > dropdb > droplang > dropuser > initdb > oid2name > reindexdb > vacuumdb > vacuumlo Uhm, that'd need a careful backward compatibility plan, including a period of supporting both names. > Another major change related to this topic is assemble functionalities > from binaries. We currently have 34 binaries > (is that a lot for a single software?). Does it matter? The few bytes of disk space are essentially irrelevant. That said, on the code level, there'd be considerable benefit of coalescing: > pg_command: clusterdb, createdb, dropdb, createuser, dropuser, > createlang, droplang, reindexdb, vacuumdb, vacuumlo. these commands. We could do the old trick of leaving the old names as in place as symlinks. > initdb: we already have 'pg_ctl init' (since 9.0) and could remove > initdb. I fairly strongly against removing initdb as a separate binary. Issuies during cluster creation are already annoying to debug. This is a *significant* amount of work, it'll make backpatching a nightmare (although not that much happens in these binaries). I personally see better uses of my time. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is a newly created index contains the invalid LSN?
On 2016-08-26 18:46:42 +0300, Yury Zhuravlev wrote: > Thanks all. > Now understand LSN strongly connected with WAL. > However how difficult put last system LSN instead 0? > It's not so important but will allow make use LSN more consistent. Maybe explain why you're interested in page lsns, that'd perhaps allow us to give more meaningful feedback. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 8/26/16 4:02 PM, Joshua D. Drake wrote: > Although... wouldn't run be under var? Traditionally yes, but things are changing in this area, if you consider the top-level file system of a modern Linux distribution. One reason is that "run" is/can be blown away at reboot. This wouldn't be an entirely useless feature for postgres: Can you tell otherwise which of the various pid/lock/opts files you can delete if you have killed the processes and want to eliminate any left-over state? Is postmaster.opts a configuration file or a state file? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is a newly created index contains the invalid LSN?
On Fri, Aug 26, 2016 at 10:46 AM, Yury Zhuravlevwrote: > Now understand LSN strongly connected with WAL. > However how difficult put last system LSN instead 0? > It's not so important but will allow make use LSN more consistent. There might be performance implications when flushing the index buffers, due to the need to check each one against the WAL flush point, where we now skip that check. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 8/26/16 12:28 PM, Tom Lane wrote: > Also, I'd just as soon not move/rename things > that don't really need it. I'm just as happy with not changing anything. But if we're going to rename stuff, let's at least think about something slightly more comprehensive. Any rename is going to break a bunch of stuff. But if we break it in a way that reduces the need for future discussion or changes, it would at least be a small win in the long run. > If, for example, we decide to move > postgresql.conf to etc/postgresql.conf, that is going to break a metric > ton of stuff that doesn't need to get broken AFAICS. Sure, I'd be content with leaving those in the top-level instead of say "etc". -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 2016-08-26 11:53:21 -0400, Peter Eisentraut wrote: > On 8/25/16 10:39 PM, Michael Paquier wrote: > > I am relaunching $subject as 10 development will begin soon. As far as > > I know, there is agreement that we can do something here. Among the > > different proposals I have found: > > - pg_clog renamed to pg_commit_status, pg_xact or pg_commit > > - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal > > If we're going to do some renaming, then I suggest we do a > mini-file-system structure under $PGDATA, like > > $PGDATA/etc > $PGDATA/log > $PGDATA/run (lock files etc.) > $PGDATA/tmp > $PGDATA/var > > The names of all the things under "var" could still be refined, but it's > much less likely that users will confuse data with configuration or > plain logs under that scheme. Splitting of ephemeral data seems to have a benefit, the rest seems more like rather noisy busywork to me. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 8/26/16 11:58 AM, Magnus Hagander wrote: > $PGDATA/etc >> $PGDATA/log >> $PGDATA/run (lock files etc.) >> $PGDATA/tmp >> $PGDATA/var >> >> The names of all the things under "var" could still be refined, but it's >> much less likely that users will confuse data with configuration or >> plain logs under that scheme > > Interesting idea. I worry a bit that this might encourage distributions > to split it up into different places though, and I'm not sure we want to > encourage that.. They already do this. This would just formalize where we think appropriate boundaries between the pieces are. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On Fri, Aug 26, 2016 at 04:33:47PM -0300, Euler Taveira wrote: > On 26-08-2016 14:03, David Fetter wrote: > > Would these make sense as pg_ctl options, or are you separating them > > out because they're not instance-wide? If separating them is > > important on those grounds, how about something like pg_db or > > pg_db_command? > > > It doesn't make sense because pg_ctl is server-side and pg_command would > be client-side. Perfect! > >> pg_oid2name: I don't have a strong opinion that it fits in pg_command; > > > > I vaguely knew that this existed, but I can't recall having heard of > > anybody actually using it. I suppose it's under pg_ctl if the split > > above between instance-wide and db-specific holds. > > > I don't use it for a long time. It also a client-side binary then better > place for it is pg_command. BTW, is anybody using it? If so, we could > add this functionality to psql and remove it. Sure. For server versions 10 or better, it could be a call to a new server-side function. Best, David. -- David Fetterhttp://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 08/26/2016 07:03 PM, David Fetter wrote: On Fri, Aug 26, 2016 at 01:26:39PM -0300, Euler Taveira wrote: Hi, ... >> initdb: we already have 'pg_ctl init' (since 9.0) and could remove initdb. Opinions? +1 for removing initdb. We can't just remove it because pg_ctl actually calls initdb, so we would have to move the code from src/bin/initdb to src/bin/pg_ctl. Which is doable, but I assume we'd also change how we pass parameters to it - right now we have to do this: pg_ctl init -o "... initdb params ..." But I assume we'd change this to regular parameters, if 'init' gets merged into pg_ctl - otherwise we'd force users to construct the param string only to immediately parse it (currently the "system()" call does that for us, but we'd have to implement that in pg_ctl). That however means that not only we break the scripts for all those calling initdb directly (and there's quite a few of them, I'm sure, exactly because it's simpler to pass the parameters directly), but also for all those already using pg_ctl. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
Simon Riggs wrote: > On 26 August 2016 at 18:26, Euler Taveirawrote: > > > I'm bringing this $subject into discussion again. Historically, we are > > carrying binary names that have been confused newbies. createuser is the > > worst name so for. Also, names like createdb, initdb, reindexdb, and > > droplang does not suggest what product it is referring to. Adding a > > prefix (pg_, pg, ...) would 'make things clear'. If we have a consensus > > about this change, I suggest renaming the following binaries: > Why not just remove them all and change the docs to suggest using > > psql -c "CREATE DATABASE foo" I agree that some of these commands are redundant, and that we seem to have them just because we can, not because they provide useful functionality: createdb createlang createuser dropdb droplang dropuser I would rather have a single command that can do all those actions. We could have things like pg_ctl createuser I agree with Simon that many "serious users" would not be using the shell to implement this functionality, but something higher-level (a GUI, some web app) that's going to go through SQL instead. Still, for ad-hoc admin tasks it seems convenient to have the shell interface. Nowadays git and many other programs have a model where a single command (git) can call a number of different binaries which are not in PATH (/usr/lib/git-core in my installation). We could do something similar, where little else apart from pg_ctl, pg_dump, pgbench are in path, and most binaries are in some hidden binary directory known to pg_ctl. pg_ctl vacuum -j4 (Personally, I very often have postmaster running on its own terminal with output to stderr. It'd be a little sad to lose that functionality, but what value does it offer to regular users? I can script my way around its lack, if needed.) -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 26 August 2016 at 18:28, Tom Lanewrote: > Also, I'd just as soon not move/rename things that don't really need it. +1 Let's leave everything exactly as it is now... but put a small README in each directory to explain why files in it shouldn't be deleted to make space. That helps the few people who made such mistakes, but doesn't cause massive change as a result. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 2016-08-26 15:03, Andres Freund wrote: On 2016-08-26 22:01:58 +0200, Simon Riggs wrote: On 26 August 2016 at 18:26, Euler Taveirawrote: > I'm bringing this $subject into discussion again. Historically, we are > carrying binary names that have been confused newbies. createuser is the > worst name so for. Also, names like createdb, initdb, reindexdb, and > droplang does not suggest what product it is referring to. Adding a > prefix (pg_, pg, ...) would 'make things clear'. If we have a consensus > about this change, I suggest renaming the following binaries: > > clusterdb > createdb > createlang > createuser > dropdb > droplang > dropuser > initdb > oid2name > reindexdb > vacuumdb > vacuumlo Why not just remove them all and change the docs to suggest using psql -c "CREATE DATABASE foo" etc Less code, no confusion because we have just one client tool - psql. Several of them have the ability to connect to several databases, some even do that in parallel. vacuumdb being one that I've needed recently to do a number of DB's in a row. -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: l...@lerctr.org US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 2016-08-26 22:01:58 +0200, Simon Riggs wrote: > On 26 August 2016 at 18:26, Euler Taveirawrote: > > > I'm bringing this $subject into discussion again. Historically, we are > > carrying binary names that have been confused newbies. createuser is the > > worst name so for. Also, names like createdb, initdb, reindexdb, and > > droplang does not suggest what product it is referring to. Adding a > > prefix (pg_, pg, ...) would 'make things clear'. If we have a consensus > > about this change, I suggest renaming the following binaries: > > > > clusterdb > > createdb > > createlang > > createuser > > dropdb > > droplang > > dropuser > > initdb > > oid2name > > reindexdb > > vacuumdb > > vacuumlo > > Why not just remove them all and change the docs to suggest using > > psql -c "CREATE DATABASE foo" > etc > > Less code, no confusion because we have just one client tool - psql. Several of them have the ability to connect to several databases, some even do that in parallel. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 08/26/2016 09:28 AM, Tom Lane wrote: Magnus Haganderwrites: On Aug 26, 2016 5:54 PM, "Peter Eisentraut" < peter.eisentr...@2ndquadrant.com> wrote: If we're going to do some renaming, then I suggest we do a mini-file-system structure under $PGDATA, like $PGDATA/etc $PGDATA/log $PGDATA/run (lock files etc.) $PGDATA/tmp $PGDATA/var The names of all the things under "var" could still be refined, but it's much less likely that users will confuse data with configuration or plain logs under that scheme Interesting idea. I worry a bit that this might encourage distributions to split it up into different places though, and I'm not sure we want to encourage that.. Yeah, I'm afraid that these names are not as well standardized as Peter probably wishes they were. Also, I'd just as soon not move/rename things that don't really need it. If, for example, we decide to move postgresql.conf to etc/postgresql.conf, that is going to break a metric ton of stuff that doesn't need to get broken AFAICS. I am not so sure that is accurate. Yes, Windows is an outlying but any Unix person is going to easily understand etc log run. Further as Linux is by far our most run platform (outside of possibly Windows) it will feel right at home for the largest growing user base. Personally I really like the idea of: $PGDATA/etc/postgresql.conf $PGDATA/log/Thu.log $PGDATA/run/postmaster.pid $PGDATA/var/base/16758 Although... wouldn't run be under var? JD regards, tom lane -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 26 August 2016 at 18:26, Euler Taveirawrote: > I'm bringing this $subject into discussion again. Historically, we are > carrying binary names that have been confused newbies. createuser is the > worst name so for. Also, names like createdb, initdb, reindexdb, and > droplang does not suggest what product it is referring to. Adding a > prefix (pg_, pg, ...) would 'make things clear'. If we have a consensus > about this change, I suggest renaming the following binaries: > > clusterdb > createdb > createlang > createuser > dropdb > droplang > dropuser > initdb > oid2name > reindexdb > vacuumdb > vacuumlo Why not just remove them all and change the docs to suggest using psql -c "CREATE DATABASE foo" etc Less code, no confusion because we have just one client tool - psql. Almost all serious users will be issuing more than one DDL command at a time, so they'll be writing a script anyway. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On 26-08-2016 14:03, David Fetter wrote: > Would these make sense as pg_ctl options, or are you separating them > out because they're not instance-wide? If separating them is > important on those grounds, how about something like pg_db or > pg_db_command? > It doesn't make sense because pg_ctl is server-side and pg_command would be client-side. >> pg_oid2name: I don't have a strong opinion that it fits in pg_command; > > I vaguely knew that this existed, but I can't recall having heard of > anybody actually using it. I suppose it's under pg_ctl if the split > above between instance-wide and db-specific holds. > I don't use it for a long time. It also a client-side binary then better place for it is pg_command. BTW, is anybody using it? If so, we could add this functionality to psql and remove it. -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] old_snapshot_threshold documentation
I doubt the documentation for old_snapshot_threshold is going to be understood by many ordinary users. What is a "snapshot", first of all? Why would a snapshot be old? Why is that a problem? What can I do to avoid it? What are the circumstances in practice where this issue would arise, and what are the trade-offs? Where can I see existing snapshots and how old they are? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
Robert Haaswrites: > On Fri, Aug 26, 2016 at 11:26 AM, Tom Lane wrote: >> I'm not quite sure what to do about this. It feels a tad wrong to use >> ErrorContext as the active context during HandleParallelMessages, but >> what else should we use? Maybe this needs its very own private context >> that we can reset after each use? > If we use ErrorContext, will anything go wrong? I think it might, if we were unlucky enough to be doing this while the leader was engaged in reporting some other error for its own reasons. To avoid accumulated memory leakage over multiple worker error reports, we ought to do a MemoryContextReset at the start or end (probably start) of HandleParallelMessages, and that would be problematic if ErrorContext had some data in it already. It's possible that the scenario can't occur because CHECK_FOR_INTERRUPTS is never done inside error processing, but I would not feel very comfortable with that assumption. I'm thinking a dedicated context is the way to go. It won't take very much code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
On Fri, Aug 26, 2016 at 11:26 AM, Tom Lanewrote: > Or in short, this has confused edata and newedata. Valid coding would > be > oldcontext = MemoryContextSwitchTo(newedata->assoc_context); > rather than what is there. Oh, right. >>> (Note that in the sole >>> existing use-case, edata->assoc_context is going to have been set to >>> CurrentMemoryContext by pq_parse_errornotice, and I see no good reason to >>> assume that's very long-lived ... in fact, it looks like it's whatever >>> happens to be active when ProcessInterrupts is called, which means there's >>> probably a totally separate set of problems here having to do with >>> potential leaks into long-lived contexts.) > >> Oops. Yes. > > I'm not quite sure what to do about this. It feels a tad wrong to use > ErrorContext as the active context during HandleParallelMessages, but > what else should we use? Maybe this needs its very own private context > that we can reset after each use? If we use ErrorContext, will anything go wrong? >>> I have little use for the name of that function either, as it's not >>> necessarily going to "throw" anything. Maybe ReportErrorUsingData, >>> or something like that? > >> I deliberately avoided that sort of terminology because it need not be >> an ERROR. It can be, say, a NOTICE. It is definitely something that >> is coming from an ErrorData but it need not be an ERROR. > > Right, but to me, "throw" generally means a transfer of control, which > is exactly what this isn't going to do if the message is only a notice. Fair point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SortSupport for macaddr type
And here's a commitfest link: https://commitfest.postgresql.org/10/743/ Also, a correction to my original message: the unreferenced [1] footnote points back to the thread that included the patch for UUID SortSupport. [1] https://www.postgresql.org/message-id/CAM3SWZR4avsTwwNVUzRNbHk8v36W-QBqpoKg=ogkwwy0dkt...@mail.gmail.com On Fri, Aug 26, 2016 at 10:44:22AM -0700, Brandur wrote: > Hello, > > I've attached a patch to add SortSupport for Postgres' macaddr which has the > effect of improving the performance of sorting operations for the type. The > strategy that I employ is very similar to that for UUID, which is to create > abbreviated keys by packing as many bytes from the MAC address as possible > into > Datums, and then performing fast unsigned integer comparisons while sorting. > > I ran some informal local benchmarks, and for cardinality greater than 100k > rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For those > interested, I put a few more numbers into a small report here [2].) > > Admittedly, this is not quite as useful as speeding up sorting on a more > common > data type like TEXT or UUID, but the change still seems like a useful > performance improvement. I largely wrote it as an exercise to familiarize > myself with the Postgres codebase. > > I'll add an entry into the current commitfest as suggested by the Postgres > Wiki > and follow up here with a link. > > Thanks, and if anyone has feedback or other thoughts, let me know! > > Brandur > > [1] > https://www.postgresql.org/message-id/CAM3SWZR4avsTwwNVUzRNbHk8v36W-QBqpoKg=ogkwwy0dkt...@mail.gmail.com > [2] > https://docs.google.com/spreadsheets/d/1cUqbg9RTgSo16WrrfJJwDP1eDaWL3TNOxnIOFNpfwgA/edit?usp=sharing -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] SortSupport for macaddr type
Hello, I've attached a patch to add SortSupport for Postgres' macaddr which has the effect of improving the performance of sorting operations for the type. The strategy that I employ is very similar to that for UUID, which is to create abbreviated keys by packing as many bytes from the MAC address as possible into Datums, and then performing fast unsigned integer comparisons while sorting. I ran some informal local benchmarks, and for cardinality greater than 100k rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For those interested, I put a few more numbers into a small report here [2].) Admittedly, this is not quite as useful as speeding up sorting on a more common data type like TEXT or UUID, but the change still seems like a useful performance improvement. I largely wrote it as an exercise to familiarize myself with the Postgres codebase. I'll add an entry into the current commitfest as suggested by the Postgres Wiki and follow up here with a link. Thanks, and if anyone has feedback or other thoughts, let me know! Brandur [1] https://www.postgresql.org/message-id/CAM3SWZR4avsTwwNVUzRNbHk8v36W-QBqpoKg=ogkwwy0dkt...@mail.gmail.com [2] https://docs.google.com/spreadsheets/d/1cUqbg9RTgSo16WrrfJJwDP1eDaWL3TNOxnIOFNpfwgA/edit?usp=sharing >From 73f7dbf1921eea7106c93654dd7bf1fdd0888a07 Mon Sep 17 00:00:00 2001 From: BrandurDate: Tue, 16 Aug 2016 17:07:07 -0700 Subject: [PATCH] Implement SortSupport for macaddr data type Introduces a scheme to produce abbreviated keys for the macaddr type which involves packing as many of a MAC address' six bytes as possible into a Datum (and adding two zeroed bytes of padding on a 64-bit system). Sorting routines can then take advantage of the abbreviated keys by performing fast unsigned integer comparisons, resulting in a significant performance increase when operating on > 100k rows or more. This implementation is modeled closely on the addition of SortSupport for the UUID data type, which employs a very similar strategy. --- src/backend/utils/adt/mac.c | 210 src/include/catalog/pg_amproc.h | 1 + src/include/catalog/pg_proc.h | 2 + src/include/utils/builtins.h| 1 + 4 files changed, 214 insertions(+) diff --git a/src/backend/utils/adt/mac.c b/src/backend/utils/adt/mac.c index 509315a..471c3f0 100644 --- a/src/backend/utils/adt/mac.c +++ b/src/backend/utils/adt/mac.c @@ -7,9 +7,13 @@ #include "postgres.h" #include "access/hash.h" +#include "lib/hyperloglog.h" #include "libpq/pqformat.h" +#include "port/pg_bswap.h" #include "utils/builtins.h" +#include "utils/guc.h" #include "utils/inet.h" +#include "utils/sortsupport.h" /* @@ -22,6 +26,21 @@ #define lobits(addr) \ ((unsigned long)(((addr)->d<<16)|((addr)->e<<8)|((addr)->f))) +/* sortsupport for macaddr */ +typedef struct +{ + int64 input_count;/* number of non-null values seen */ + boolestimating; /* true if estimating cardinality */ + + hyperLogLogState abbr_card; /* cardinality estimator */ +} macaddr_sortsupport_state; + +static int macaddr_cmp_internal(macaddr *a1, macaddr *a2); +static int macaddr_fast_cmp(Datum x, Datum y, SortSupport ssup); +static int macaddr_cmp_abbrev(Datum x, Datum y, SortSupport ssup); +static bool macaddr_abbrev_abort(int memtupcount, SortSupport ssup); +static Datum macaddr_abbrev_convert(Datum original, SortSupport ssup); + /* * MAC address reader. Accepts several common notations. */ @@ -318,3 +337,194 @@ macaddr_trunc(PG_FUNCTION_ARGS) PG_RETURN_MACADDR_P(result); } + +/* + * SortSupport strategy function. Populates a SortSupport struct with the + * information necessary to use comparison by abbreviated keys. + */ +Datum +macaddr_sortsupport(PG_FUNCTION_ARGS) +{ + SortSupport ssup = (SortSupport) PG_GETARG_POINTER(0); + + ssup->comparator = macaddr_fast_cmp; + ssup->ssup_extra = NULL; + + if (ssup->abbreviate) + { + macaddr_sortsupport_state *uss; + MemoryContext oldcontext; + + oldcontext = MemoryContextSwitchTo(ssup->ssup_cxt); + + uss = palloc(sizeof(macaddr_sortsupport_state)); + uss->input_count = 0; + uss->estimating = true; + initHyperLogLog(>abbr_card, 10); + + ssup->ssup_extra = uss; + + ssup->comparator = macaddr_cmp_abbrev; + ssup->abbrev_converter = macaddr_abbrev_convert; + ssup->abbrev_abort = macaddr_abbrev_abort; + ssup->abbrev_full_comparator = macaddr_fast_cmp; + + MemoryContextSwitchTo(oldcontext); + } + + PG_RETURN_VOID(); +} + +/* + * SortSupport "traditional" comparison function. Pulls two MAC addresses from + * the heap and runs a standard comparison on them. + */ +static int +macaddr_fast_cmp(Datum x, Datum y, SortSupport ssup) +{ +
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
Alvaro Herrerawrites: > Tom Lane wrote: >> Any objections? Anyone want to bikeshed the field name? I considered >> PG_DIAG_SEVERITY_NONLOCALIZED and PG_DIAG_SEVERITY_ENGLISH before settling >> on PG_DIAG_SEVERITY_ASCII, but I can't say I'm in love with that. > I didn't review the patch, but +1 on the idea. As for the name, I think > ASCII is the wrong thing (as many labels in other languages can be in > ascii too). I vote for NONLOCALIZED. > I see character "s" is already taken in the protocol; that would be my > first preference rather than A. How about Z? One of the reasons I didn't pick NONLOCALIZED was that I couldn't come up with an appropriate letter for the protocol. I guess Z is okay, but it's a bit of a stretch (especially if you're of the persuasion that wants to spell it "nonlocalised"). Maybe V for "seVerity"? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
Heikki Linnakangaswrites: > Yeah, they want people to move to their own SSL library [1]. > [1] I couldn't find any official statement, but lots of blog posts > saying the same thing. As I recall, the deprecation warning messages said that in so many words. That probably counts as an official statement ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
Tom Lane wrote: > So far as I can find, the attached is all we need to do to introduce a > new message field. (This patch doesn't address the memory-context > questions, but it does fix the localization-driven failure demonstrated > upthread.) > > Any objections? Anyone want to bikeshed the field name? I considered > PG_DIAG_SEVERITY_NONLOCALIZED and PG_DIAG_SEVERITY_ENGLISH before settling > on PG_DIAG_SEVERITY_ASCII, but I can't say I'm in love with that. I didn't review the patch, but +1 on the idea. As for the name, I think ASCII is the wrong thing (as many labels in other languages can be in ascii too). I vote for NONLOCALIZED. I see character "s" is already taken in the protocol; that would be my first preference rather than A. How about Z? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 08/26/2016 07:44 PM, Tom Lane wrote: Peter Eisentrautwrites: On 8/26/16 5:31 AM, Heikki Linnakangas wrote: I think now would be a good time to drop support for OpenSSL versions older than 0.9.8. OpenSSL don't even support 0.9.8 anymore, although there are probably distributions out there that still provide patches for it. But OpenSSL 0.9.7 and older are really not interesting for PostgreSQL 10 anymore, I think. CentOS 5 currently ships 0.9.8e. That's usually the oldest OS we want to support eagerly. Also, I get this on fully-up-to-date OS X (El Capitan): $ openssl version OpenSSL 0.9.8zh 14 Jan 2016 Ok, sold, let's remove support for OpenSSL < 0.9.8. Worth noting though is that without -Wno-deprecated-declarations, you find that Apple has sprinkled the entire OpenSSL API with deprecation warnings. That suggests that their plan for the future is to drop it rather than update it. Should we be thinking ahead to that? Yeah, they want people to move to their own SSL library [1]. I doubt they will actually remove it any time soon, but who knows. It would be a good project for someone with an OS X system and some spare time, to write a patch to build with OS X's native SSL library instead of OpenSSL. The code is structured nicely to enable that now. [1] I couldn't find any official statement, but lots of blog posts saying the same thing. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming some binaries
On Fri, Aug 26, 2016 at 01:26:39PM -0300, Euler Taveira wrote: > Hi, > > I'm bringing this $subject into discussion again. Historically, we are > carrying binary names that have been confused newbies. createuser is the > worst name so for. Also, names like createdb, initdb, reindexdb, and > droplang does not suggest what product it is referring to. Adding a > prefix (pg_, pg, ...) would 'make things clear'. +1 for pg_ . We should have done this long ago, but this is better fixed than left broken. > If we have a consensus > about this change, I suggest renaming the following binaries: > > clusterdb > createdb > createlang > createuser > dropdb > droplang > dropuser > initdb > oid2name > reindexdb > vacuumdb > vacuumlo > > Another major change related to this topic is assemble functionalities > from binaries. We currently have 34 binaries (is that a lot for a single > software?). Also, some of them have the same principle: execute a > administrative or maintenance command. IMHO, from the list above, we > could reduce it to: > > pg_command: clusterdb, createdb, dropdb, createuser, dropuser, > createlang, droplang, reindexdb, vacuumdb, vacuumlo. It also has the > advantage to allow adding new administrative/maintenance commands to it > in the future; Would these make sense as pg_ctl options, or are you separating them out because they're not instance-wide? If separating them is important on those grounds, how about something like pg_db or pg_db_command? > pg_oid2name: I don't have a strong opinion that it fits in pg_command; I vaguely knew that this existed, but I can't recall having heard of anybody actually using it. I suppose it's under pg_ctl if the split above between instance-wide and db-specific holds. > initdb: we already have 'pg_ctl init' (since 9.0) and could remove initdb. > > Opinions? +1 for removing initdb. Best, David. -- David Fetterhttp://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
Peter Eisentrautwrites: > On 8/26/16 5:31 AM, Heikki Linnakangas wrote: >> I think now would be a good time to drop support for OpenSSL versions >> older than 0.9.8. OpenSSL don't even support 0.9.8 anymore, although >> there are probably distributions out there that still provide patches >> for it. But OpenSSL 0.9.7 and older are really not interesting for >> PostgreSQL 10 anymore, I think. > CentOS 5 currently ships 0.9.8e. That's usually the oldest OS we want > to support eagerly. Also, I get this on fully-up-to-date OS X (El Capitan): $ openssl version OpenSSL 0.9.8zh 14 Jan 2016 Worth noting though is that without -Wno-deprecated-declarations, you find that Apple has sprinkled the entire OpenSSL API with deprecation warnings. That suggests that their plan for the future is to drop it rather than update it. Should we be thinking ahead to that? regards, tom lane PS: I still have 0.9.7 on some of my buildfarm critters. But I could either update them, or stop using --with-openssl there. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 8/26/16 5:31 AM, Heikki Linnakangas wrote: > I think now would be a good time to drop support for OpenSSL versions > older than 0.9.8. OpenSSL don't even support 0.9.8 anymore, although > there are probably distributions out there that still provide patches > for it. But OpenSSL 0.9.7 and older are really not interesting for > PostgreSQL 10 anymore, I think. CentOS 5 currently ships 0.9.8e. That's usually the oldest OS we want to support eagerly. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, Aug 26, 2016 at 3:13 PM, Ashutosh Bapatwrote: > > > On Fri, Aug 26, 2016 at 11:37 AM, Masahiko Sawada > wrote: >> >> On Fri, Aug 26, 2016 at 3:03 PM, Ashutosh Bapat >> wrote: >> > >> > >> > On Fri, Aug 26, 2016 at 11:22 AM, Masahiko Sawada >> > >> > wrote: >> >> >> >> On Fri, Aug 26, 2016 at 1:32 PM, Vinayak Pokale >> >> wrote: >> >> > Hi All, >> >> > >> >> > Ashutosh proposed the feature 2PC for FDW for achieving atomic >> >> > commits >> >> > across multiple foreign servers. >> >> > If a transaction make changes to more than two foreign servers the >> >> > current >> >> > implementation in postgres_fdw doesn't make sure that either all of >> >> > them >> >> > commit or all of them rollback their changes. >> >> > >> >> > We (Masahiko Sawada and me) reopen this thread and trying to >> >> > contribute >> >> > in >> >> > it. >> >> > >> >> > 2PC for FDW >> >> > >> >> > The patch provides support for atomic commit for transactions >> >> > involving >> >> > foreign servers. when the transaction makes changes to foreign >> >> > servers, >> >> > either all the changes to all the foreign servers commit or rollback. >> >> > >> >> > The new patch 2PC for FDW include the following things: >> >> > 1. The patch 0001 introduces a generic feature. All kinds of FDW that >> >> > support 2PC such as oracle_fdw, mysql_fdw, postgres_fdw etc. can >> >> > involve >> >> > in >> >> > the transaction. >> >> > >> >> > Currently we can push some conditions down to shard nodes, especially >> >> > in >> >> > 9.6 >> >> > the directly modify feature has >> >> > been introduced. But such a transaction modifying data on shard node >> >> > is >> >> > not >> >> > executed surely. >> >> > Using 0002 patch, that modify is executed with 2PC. It means that we >> >> > almost >> >> > can provide sharding solution using >> >> > multiple PostgreSQL server (one parent node and several shared node). >> >> > >> >> > For multi master, we definitely need transaction manager but >> >> > transaction >> >> > manager probably can use this 2PC for FDW feature to manage >> >> > distributed >> >> > transaction. >> >> > >> >> > 2. 0002 patch makes postgres_fdw possible to use 2PC. >> >> > >> >> > 0002 patch makes postgres_fdw to use below APIs. These APIs are >> >> > generic >> >> > features which can be used by all kinds of FDWs. >> >> > >> >> > a. Execute PREAPRE TRANSACTION and COMMIT/ABORT PREAPRED instead >> >> > of >> >> > COMMIT/ABORT on foreign server which supports 2PC. >> >> > b. Manage information of foreign prepared transactions resolver >> >> > >> >> > Masahiko Sawada will post the patch. >> >> > >> >> > >> >> >> > >> > Thanks Vinayak and Sawada-san for taking this forward and basing your >> > work >> > on my patch. >> > >> >> >> >> Still lot of work to do but attached latest patches. >> >> These are based on the patch Ashutosh posted before, I revised it and >> >> divided into two patches. >> >> Compare with original patch, patch of pg_fdw_xact_resolver and >> >> documentation are lacked. >> > >> > >> > I am not able to understand the last statement. >> >> Sorry to confuse you. >> >> > Do you mean to say that your patches do not have pg_fdw_xact_resolver() >> > and >> > documentation that my patches had? >> >> Yes. >> I'm confirming them that your patches had. > > > Thanks for the clarification. I had added pg_fdw_xact_resolver() to resolve > any transactions which can not be resolved immediately after they were > prepared. There was a comment from Kevin (IIRC) that leaving transactions > unresolved on the foreign server keeps the resources locked on those > servers. That's not a very good situation. And nobody but the initiating > server can resolve those. That functionality is important to make it a > complete 2PC solution. So, please consider it to be included in your first > set of patches. > Yeah, I know the reason why pg_fdw_xact_resolver is required. I will add it as a separated patch. Regards, -- Masahiko Sawada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
Magnus Haganderwrites: > On Aug 26, 2016 5:54 PM, "Peter Eisentraut" < > peter.eisentr...@2ndquadrant.com> wrote: >> If we're going to do some renaming, then I suggest we do a >> mini-file-system structure under $PGDATA, like >> >> $PGDATA/etc >> $PGDATA/log >> $PGDATA/run (lock files etc.) >> $PGDATA/tmp >> $PGDATA/var >> >> The names of all the things under "var" could still be refined, but it's >> much less likely that users will confuse data with configuration or >> plain logs under that scheme > Interesting idea. I worry a bit that this might encourage distributions to > split it up into different places though, and I'm not sure we want to > encourage that.. Yeah, I'm afraid that these names are not as well standardized as Peter probably wishes they were. Also, I'd just as soon not move/rename things that don't really need it. If, for example, we decide to move postgresql.conf to etc/postgresql.conf, that is going to break a metric ton of stuff that doesn't need to get broken AFAICS. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Renaming some binaries
Hi, I'm bringing this $subject into discussion again. Historically, we are carrying binary names that have been confused newbies. createuser is the worst name so for. Also, names like createdb, initdb, reindexdb, and droplang does not suggest what product it is referring to. Adding a prefix (pg_, pg, ...) would 'make things clear'. If we have a consensus about this change, I suggest renaming the following binaries: clusterdb createdb createlang createuser dropdb droplang dropuser initdb oid2name reindexdb vacuumdb vacuumlo Another major change related to this topic is assemble functionalities from binaries. We currently have 34 binaries (is that a lot for a single software?). Also, some of them have the same principle: execute a administrative or maintenance command. IMHO, from the list above, we could reduce it to: pg_command: clusterdb, createdb, dropdb, createuser, dropuser, createlang, droplang, reindexdb, vacuumdb, vacuumlo. It also has the advantage to allow adding new administrative/maintenance commands to it in the future; pg_oid2name: I don't have a strong opinion that it fits in pg_command; initdb: we already have 'pg_ctl init' (since 9.0) and could remove initdb. Opinions? -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unsupported feature F867: WITH TIES
On 8/26/16 9:06 AM, Jürgen Purtz wrote: > Actually we don't support the SQL feature F867 "FETCH FIRST clause: WITH > TIES option". On the other side we support the window function "rank()" > which acts like the "WITH TIES option". My questions are: Is it hard to > implement the "WITH TIES option"? Are there plans for a realization / > how do we decide in general what to do next? Differs the semantic of the > "WITH TIES option" significantly from the "rank()" window function or > can we treat it as some kind of 'syntactical sugar'? LIMIT/FETCH FIRST works at a level that is quite far removed from data type semantics, which is what you'd need to have handy to compute ties. LIMIT basically just tells the executor to stop after getting a certain number of rows. So implementing WITH TIES would be very difficult in the current setup. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On Aug 26, 2016 5:54 PM, "Peter Eisentraut" < peter.eisentr...@2ndquadrant.com> wrote: > > On 8/25/16 10:39 PM, Michael Paquier wrote: > > I am relaunching $subject as 10 development will begin soon. As far as > > I know, there is agreement that we can do something here. Among the > > different proposals I have found: > > - pg_clog renamed to pg_commit_status, pg_xact or pg_commit > > - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal > > If we're going to do some renaming, then I suggest we do a > mini-file-system structure under $PGDATA, like > > $PGDATA/etc > $PGDATA/log > $PGDATA/run (lock files etc.) > $PGDATA/tmp > $PGDATA/var > > The names of all the things under "var" could still be refined, but it's > much less likely that users will confuse data with configuration or > plain logs under that scheme Interesting idea. I worry a bit that this might encourage distributions to split it up into different places though, and I'm not sure we want to encourage that.. /Magnus
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
I wrote: > After sleeping on it, I think the right answer is to introduce the new > error-message field (and not worry about 9.5). Will work on a patch > for that, unless I hear objections pretty soon. So far as I can find, the attached is all we need to do to introduce a new message field. (This patch doesn't address the memory-context questions, but it does fix the localization-driven failure demonstrated upthread.) Any objections? Anyone want to bikeshed the field name? I considered PG_DIAG_SEVERITY_NONLOCALIZED and PG_DIAG_SEVERITY_ENGLISH before settling on PG_DIAG_SEVERITY_ASCII, but I can't say I'm in love with that. regards, tom lane diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index f22e3da..40ae0ff 100644 *** a/doc/src/sgml/libpq.sgml --- b/doc/src/sgml/libpq.sgml *** char *PQresultErrorField(const PGresult *** 2767,2772 --- 2767,2788 + + PG_DIAG_SEVERITY_ASCII + + + The severity; the field contents are ERROR, + FATAL, or PANIC (in an error message), + or WARNING, NOTICE, DEBUG, + INFO, or LOG (in a notice message). + This is identical to the PG_DIAG_SEVERITY field except + that the contents are never localized. This is present only in + reports generated by PostgreSQL versions 9.6 + and later. + + + + PG_DIAG_SQLSTATE diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 9c96d8f..9bddb19 100644 *** a/doc/src/sgml/protocol.sgml --- b/doc/src/sgml/protocol.sgml *** message. *** 4884,4889 --- 4884,4908 + A + + + + Severity: the field contents are + ERROR, FATAL, or + PANIC (in an error message), or + WARNING, NOTICE, DEBUG, + INFO, or LOG (in a notice message). + This is identical to the S field except + that the contents are never localized. This is present only in + messages generated by PostgreSQL versions 9.6 + and later. + + + + + + C diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c index 921242f..b04a7ef 100644 *** a/src/backend/libpq/pqmq.c --- b/src/backend/libpq/pqmq.c *** pq_parse_errornotice(StringInfo msg, Err *** 237,246 switch (code) { case PG_DIAG_SEVERITY: if (strcmp(value, "DEBUG") == 0) ! edata->elevel = DEBUG1; /* or some other DEBUG level */ else if (strcmp(value, "LOG") == 0) ! edata->elevel = LOG; /* can't be COMMERROR */ else if (strcmp(value, "INFO") == 0) edata->elevel = INFO; else if (strcmp(value, "NOTICE") == 0) --- 237,262 switch (code) { case PG_DIAG_SEVERITY: + /* ignore, trusting we'll get a nonlocalized version */ + break; + case PG_DIAG_SEVERITY_ASCII: if (strcmp(value, "DEBUG") == 0) ! { ! /* ! * We can't reconstruct the exact DEBUG level, but ! * presumably it was >= client_min_messages, so select ! * DEBUG1 to ensure we'll pass it on to the client. ! */ ! edata->elevel = DEBUG1; ! } else if (strcmp(value, "LOG") == 0) ! { ! /* ! * It can't be LOG_SERVER_ONLY, or the worker wouldn't ! * have sent it to us; so LOG is the correct value. ! */ ! edata->elevel = LOG; ! } else if (strcmp(value, "INFO") == 0) edata->elevel = INFO; else if (strcmp(value, "NOTICE") == 0) *** pq_parse_errornotice(StringInfo msg, Err *** 254,264 else if (strcmp(value, "PANIC") == 0) edata->elevel = PANIC; else ! elog(ERROR, "unknown error severity"); break; case PG_DIAG_SQLSTATE: if (strlen(value) != 5) ! elog(ERROR, "malformed sql state"); edata->sqlerrcode = MAKE_SQLSTATE(value[0], value[1], value[2], value[3], value[4]); break; --- 270,280 else if (strcmp(value, "PANIC") == 0) edata->elevel = PANIC; else ! elog(ERROR, "unrecognized error severity: \"%s\"", value); break; case PG_DIAG_SQLSTATE: if (strlen(value) != 5) ! elog(ERROR, "invalid SQLSTATE: \"%s\"", value); edata->sqlerrcode = MAKE_SQLSTATE(value[0], value[1], value[2], value[3], value[4]); break; *** pq_parse_errornotice(StringInfo msg, Err *** 308,314 edata->funcname = pstrdup(value); break; default: ! elog(ERROR, "unknown error field: %d", (int) code); break; } } --- 324,330 edata->funcname = pstrdup(value); break; default: ! elog(ERROR, "unrecognized error field code: %d", (int) code); break; } } diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index 78d441d..ab71b50
Re: [HACKERS] WAL consistency check facility
Hi Kuntal, Thanks for the patch. Current patch has no docs, no tests and no explanatory comments, so makes review quite hard. The good news is you might discover a few bugs with it, so its worth pursuing actively in this CF, though its not near to being committable. I think you should add this as part of the default testing for both check and installcheck. I can't imagine why we'd have it and not use it during testing. On 25 August 2016 at 18:41, Kuntal Ghoshwrote: > * wal_consistency_mask = 511 /* Enable consistency check mask bit*/ What does this mean? (No docs) > What needs to be done: > 1. Add support for other Resource Managers. We probably need to have a discussion as to why you think this should be Rmgr dependent? Code comments would help there. If it does, then you should probably do this by extending RmgrTable with an rm_check, so you can call it like this... RmgrTable[record->xl_rmid].rm_check I'm interested in how we handle the new generic WAL format for blocks. Surely if we can handle that then we won't need an Rmgr dependency? I'm sure you have reasons, they just need to be explained long hand - don't assume anything. > 5. Generalize the page type identification technique. Why not do this first? There are some coding guideline stuff to check as well. Thanks -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 8/25/16 10:39 PM, Michael Paquier wrote: > I am relaunching $subject as 10 development will begin soon. As far as > I know, there is agreement that we can do something here. Among the > different proposals I have found: > - pg_clog renamed to pg_commit_status, pg_xact or pg_commit > - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal If we're going to do some renaming, then I suggest we do a mini-file-system structure under $PGDATA, like $PGDATA/etc $PGDATA/log $PGDATA/run (lock files etc.) $PGDATA/tmp $PGDATA/var The names of all the things under "var" could still be refined, but it's much less likely that users will confuse data with configuration or plain logs under that scheme. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why is a newly created index contains the invalid LSN?
Amit Kapila wrote: On Thu, Aug 25, 2016 at 7:55 PM, Yury Zhuravlevwrote: Hello hackers. I have a small question. While working on an incremental backup I noticed a strange thing. Newly created index is contains the invalid LSN (0/0). Exmaple: ... For some of the indexes like btree which are built outside shared buffers, we don't write WAL unless wal_level >= REPLICA. I think Robert has explained it very well how we handle the crash recovery situation for such indexes. However, for some other indexes which don't bypass shared buffers like BRIN, GIN we do write WAL for such cases as well, so you must see LSN for those type of indexes. I am less sure, if there will be any problem, if don't write WAL for those indexes as well when wal_level < REPLICA. Thanks all. Now understand LSN strongly connected with WAL. However how difficult put last system LSN instead 0? It's not so important but will allow make use LSN more consistent. -- Yury Zhuravlev Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Missing checks when malloc returns NULL...
Hello, Michael. Your patch [1] was marked as "Needs review" so I decided to take a look. It looks good to me. However I found dozens of places in PostgreSQL code that seem to have similar problem you are trying to fix [2]. As far as I understand these are only places left that don't check malloc/realloc/ strdup return values properly. I thought maybe you will be willing to fix they too so we could forget about this problem forever. If not I will be happy to do it. However in this case we need someone familiar with affected parts of the code who will be willing to re-check a new patch since I'm not filling particularly confident about how exactly errors should be handled in all these cases. By the way maybe someone knows other procedures besides malloc, realloc and strdup that require special attention? [1] https://commitfest.postgresql.org/10/653/ [2] http://afiskon.ru/s/15/83287ef7d2_malloc.txt -- Best regards, Aleksander Alekseev -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
Robert Haaswrites: > On Fri, Aug 26, 2016 at 10:35 AM, Tom Lane wrote: >> BTW, while I'm looking at this: what on god's green earth is >> ThrowErrorData doing copying the supplied data into edata->assoc_context? >> Surely it should be putting the data into the ErrorContext, where it's not >> going to get flushed before it can be reported? > Uh, well, perhaps I misinterpreted the comment in elog.h. It says this: > /* context containing associated non-constant strings */ > struct MemoryContextData *assoc_context; > That sure looks like it's saying that all of the pointers stored in > the ErrorData structure should be pointing into assoc_context, unless > they are constant. Indeed. The point is that every ErrorData in the errordata stack needs to, and does, have assoc_context = ErrorContext. This coding is blithely ignoring what errstart has set up and copying the data someplace else. In fact, it's pointlessly duplicating data that is *already* in the context of the source ErrorData. You should be imagining this action as being the reverse of CopyErrorData, ie copying some data back into the purview of elog.c, which is to say it should be in ErrorContext. Or in short, this has confused edata and newedata. Valid coding would be oldcontext = MemoryContextSwitchTo(newedata->assoc_context); rather than what is there. >> (Note that in the sole >> existing use-case, edata->assoc_context is going to have been set to >> CurrentMemoryContext by pq_parse_errornotice, and I see no good reason to >> assume that's very long-lived ... in fact, it looks like it's whatever >> happens to be active when ProcessInterrupts is called, which means there's >> probably a totally separate set of problems here having to do with >> potential leaks into long-lived contexts.) > Oops. Yes. I'm not quite sure what to do about this. It feels a tad wrong to use ErrorContext as the active context during HandleParallelMessages, but what else should we use? Maybe this needs its very own private context that we can reset after each use? >> I have little use for the name of that function either, as it's not >> necessarily going to "throw" anything. Maybe ReportErrorUsingData, >> or something like that? > I deliberately avoided that sort of terminology because it need not be > an ERROR. It can be, say, a NOTICE. It is definitely something that > is coming from an ErrorData but it need not be an ERROR. Right, but to me, "throw" generally means a transfer of control, which is exactly what this isn't going to do if the message is only a notice. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On Aug 26, 2016 5:13 PM, "Joshua D. Drake"wrote: > > On 08/25/2016 07:39 PM, Michael Paquier wrote: >> >> Hi all, >> >> I am relaunching $subject as 10 development will begin soon. As far as >> I know, there is agreement that we can do something here. Among the >> different proposals I have found: >> - pg_clog renamed to pg_commit_status, pg_xact or pg_commit >> - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal > > > I think the use of the pg_ prefix is redundant. Just a directory called: wal will do (for example). > > In reference to pg_xlog specifically, it should be wal. It is the Write Ahead Log, not the Journal (although I recognize they can be synonymous). All the documentation talks about Write Ahead Log. > Yes, let's avoid inventing a *third* name for it, please. It's already bad enough that we have both wal and xlog. /Magnus
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 08/25/2016 07:39 PM, Michael Paquier wrote: Hi all, I am relaunching $subject as 10 development will begin soon. As far as I know, there is agreement that we can do something here. Among the different proposals I have found: - pg_clog renamed to pg_commit_status, pg_xact or pg_commit - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal I think the use of the pg_ prefix is redundant. Just a directory called: wal will do (for example). In reference to pg_xlog specifically, it should be wal. It is the Write Ahead Log, not the Journal (although I recognize they can be synonymous). All the documentation talks about Write Ahead Log. Sincerely, JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 08/26/2016 03:48 AM, Magnus Hagander wrote: Same reason I'm also +1 for Stephens suggestion to put all things that should not be in a base backup into the same directory. That may break things now, but it will simplify things down the road. And doing it at the same time as renaming these things makes a lot of sense, because it causes breakage that tool-builders *have* to look at, and then they will hopefully also notice the other change. If this is done this fall, developers will have at least a year to fix their utilities. JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
On Fri, Aug 26, 2016 at 10:35 AM, Tom Lanewrote: > I wrote: >> After sleeping on it, I think the right answer is to introduce the new >> error-message field (and not worry about 9.5). Will work on a patch >> for that, unless I hear objections pretty soon. > > BTW, while I'm looking at this: what on god's green earth is > ThrowErrorData doing copying the supplied data into edata->assoc_context? > Surely it should be putting the data into the ErrorContext, where it's not > going to get flushed before it can be reported? Uh, well, perhaps I misinterpreted the comment in elog.h. It says this: /* context containing associated non-constant strings */ struct MemoryContextData *assoc_context; That sure looks like it's saying that all of the pointers stored in the ErrorData structure should be pointing into assoc_context, unless they are constant. If that's not right, I suggest rewording that comment, because I cannot think of a second interpretation of what's written there. > (Note that in the sole > existing use-case, edata->assoc_context is going to have been set to > CurrentMemoryContext by pq_parse_errornotice, and I see no good reason to > assume that's very long-lived ... in fact, it looks like it's whatever > happens to be active when ProcessInterrupts is called, which means there's > probably a totally separate set of problems here having to do with > potential leaks into long-lived contexts.) Oops. Yes. > I have little use for the name of that function either, as it's not > necessarily going to "throw" anything. Maybe ReportErrorUsingData, > or something like that? I deliberately avoided that sort of terminology because it need not be an ERROR. It can be, say, a NOTICE. It is definitely something that is coming from an ErrorData but it need not be an ERROR. Also, I think "throwing an error" is pretty standard terminology that is understandable to pretty much all programmers these days. It's not a perfect name; maybe ReportErrorData would have been better, but changing that seems like pointless tinkering at this stage, from my point of view. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Euler Taveirawrites: > > On 26-08-2016 09:25, Devrim Gündüz wrote: > >> ...and we also have "pg_logical", that includes a "log" keyword already... > > > "clog" and "xlog" is almost "log"; "logical" is not. I don't imagine > > people confusing "logical" meaning "log". > > Well, I dunno; people with a weak grasp of English might have an issue > there. But I never liked that directory name anyway; my problem with > it is I read "pg_logical" and think "logical what?". Naming things using > a disconnected adjective is not good, especially one with as many > potential applications as that one has. If we're up for renaming things > under PGDATA, that one is high on my hit list. iirc, pg_logical also has both temporary and non-temporary data in it too. That complicates things for backup utilities that are trying to exclude all temporary files. If we actually move all the temp files into their own directory (or tree), then we're changing what's in pg_logical anyway, so renaming it seems like a good idea. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Renaming of pg_xlog and pg_clog
Euler Taveirawrites: > On 26-08-2016 09:25, Devrim Gündüz wrote: >> ...and we also have "pg_logical", that includes a "log" keyword already... > "clog" and "xlog" is almost "log"; "logical" is not. I don't imagine > people confusing "logical" meaning "log". Well, I dunno; people with a weak grasp of English might have an issue there. But I never liked that directory name anyway; my problem with it is I read "pg_logical" and think "logical what?". Naming things using a disconnected adjective is not good, especially one with as many potential applications as that one has. If we're up for renaming things under PGDATA, that one is high on my hit list. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 26-08-2016 09:25, Devrim Gündüz wrote: > On Fri, 2016-08-26 at 21:12 +0900, Michael Paquier wrote: >> In short, with the current names, sometimes users think that pg_xlog >> and pg_clog are just logs. And so it is fine to delete them to free up >> space, corrupting their cluster, because they are just *logs*. > > ...and we also have "pg_logical", that includes a "log" keyword already... > "clog" and "xlog" is almost "log"; "logical" is not. I don't imagine people confusing "logical" meaning "log". -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
I wrote: > After sleeping on it, I think the right answer is to introduce the new > error-message field (and not worry about 9.5). Will work on a patch > for that, unless I hear objections pretty soon. BTW, while I'm looking at this: what on god's green earth is ThrowErrorData doing copying the supplied data into edata->assoc_context? Surely it should be putting the data into the ErrorContext, where it's not going to get flushed before it can be reported? (Note that in the sole existing use-case, edata->assoc_context is going to have been set to CurrentMemoryContext by pq_parse_errornotice, and I see no good reason to assume that's very long-lived ... in fact, it looks like it's whatever happens to be active when ProcessInterrupts is called, which means there's probably a totally separate set of problems here having to do with potential leaks into long-lived contexts.) I have little use for the name of that function either, as it's not necessarily going to "throw" anything. Maybe ReportErrorUsingData, or something like that? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN logging GIN_SEGMENT_UNMODIFIED actions?
On Tue, May 10, 2016 at 9:57 PM, Alexander Korotkovwrote: > Hi! > > On Mon, May 9, 2016 at 10:46 PM, Andres Freund wrote: >> >> trying to debug something I saw the following in pg_xlogdump output: >> >> rmgr: Gin len (rec/tot): 0/ 274, tx: 0, lsn: >> 1C/DF28AEB0, prev 1C/DF289858, desc: VACUUM_DATA_LEAF_PAGE 3 segments: 5 >> unknown action 0 ???, blkref #0: rel 1663/16384/16435 blk 310982 >> >> note the "segments: 5 unknown action 0 ???" bit. That doesn't seem >> right, given: >> #define GIN_SEGMENT_UNMODIFIED 0 /* no action (not used in >> WAL records) */ > > > I've checked GIN code. Have no idea of how such wal record could be > generated... I encountered the same issue when executing the following queries and running pg_xlogdump. CREATE EXTENSION pg_trgm; CREATE TABLE test (col1 TEXT); CREATE INDEX testidx ON test USING gin (col1 gin_trgm_ops) WITH (fastupdate = off); INSERT INTO test SELECT 'ABCDE' FROM generate_series(1,1); DELETE FROM test; VACUUM test; $ pg_xlogdump data/pg_xlog/00010004 | grep Gin | grep action rmgr: Gin len (rec/tot): 0/88, tx: 0, lsn: 0/04A4B468, prev 0/04A4B438, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 11 rmgr: Gin len (rec/tot): 0/78, tx: 0, lsn: 0/04A4B4C0, prev 0/04A4B468, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 10 rmgr: Gin len (rec/tot): 0/88, tx: 0, lsn: 0/04A4B510, prev 0/04A4B4C0, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 13 rmgr: Gin len (rec/tot): 0/78, tx: 0, lsn: 0/04A4B568, prev 0/04A4B510, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 12 rmgr: Gin len (rec/tot): 0/88, tx: 0, lsn: 0/04A4B5B8, prev 0/04A4B568, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 15 rmgr: Gin len (rec/tot): 0/78, tx: 0, lsn: 0/04A4B610, prev 0/04A4B5B8, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 14 rmgr: Gin len (rec/tot): 0/88, tx: 0, lsn: 0/04A4B660, prev 0/04A4B610, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 17 rmgr: Gin len (rec/tot): 0/78, tx: 0, lsn: 0/04A4B6B8, prev 0/04A4B660, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 16 rmgr: Gin len (rec/tot): 0/88, tx: 0, lsn: 0/04A4B708, prev 0/04A4B6B8, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 19 rmgr: Gin len (rec/tot): 0/78, tx: 0, lsn: 0/04A4B760, prev 0/04A4B708, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 18 rmgr: Gin len (rec/tot): 0/88, tx: 0, lsn: 0/04A4B7B0, prev 0/04A4B760, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 21 rmgr: Gin len (rec/tot): 0/78, tx: 0, lsn: 0/04A4B808, prev 0/04A4B7B0, desc: VACUUM_DATA_LEAF_PAGE 1663 segments: 0 unknown action 0 ???, blkref #0: rel 1663/13286/16455 blk 20 Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Set log_line_prefix and application name in test drivers
Hello Peter, log_line_prefix = '%t [%p]: [%l] %qapp=%a ' which is modeled after the pgfouine recommendation, which is I believe a wide-spread convention, and it also vaguely follows syslog customs. The build farm client has log_line_prefix = '%m [%c:%l] ' which is very similar, but the lack of the PID makes it unsuitable for the purposes that I have set out, and there is no obvious place to put additional information such as %a. %m vs %t is obviously a minor issue that I will gladly adjust, but besides that I prefer to stick with my version. v2 patch looks ok, applies without trouble and works as intended: 2016-08-26 09:19:31.191 CEST [7571]: [58] app=pg_regress/event_trigger STATEMENT: alter type rewritetype alter attribute a type varchar cascade; About the format: '[\d+]' pattern is used twice, which makes the output less easily grep-able. Also, the ':' is used as a separator in the remainder of the message, so maybe once is enough at this level. I'm not sure about the "app=" is really necessary, given its very explicit definition as can be seen above above. So I would suggest something like the following, which is also a little bit more compact: log_line_prefix = '%m [%p:%l] %q%a ' If you want to keep something with %a, maybe parentheses? Finally I'm wondering also whether a timestamp since the server has started (which does not exists) would be more useful for a "make check", or at default maybe %n? -- Fabien -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WAL consistency check facility
Kuntal Ghosh wrote: > Thanks a lot. > > I just want to mention the situation where I was getting the > speculative token related inconsistency. > > ItemPointer in backup page from master: > LOG: ItemPointer BlockNumber: 1 OffsetNumber:65534 Speculative: true > CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1 > > ItemPointer in current page from slave after redo: > LOG: ItemPointer BlockNumber: 0 OffsetNumber:1 Speculative: false > CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1 > > As the block numbers are different, I was getting the following warning: > WARNING: Inconsistent page (at byte 8166) found for record > 0/127F4A48, rel 1663/16384/16946, forknum 0, blkno 0, Backup Page > Header : (pd_lower: 28 pd_upper: 8152 pd_special: 8192) Current Page > Header: (pd_lower: 28 pd_upper: 8152 pd_special: 8192) > CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1 > > In heap_xlog_insert, t_ctid is always set to blkno and xlrec->offnum. > I think this is why I was getting the above warning. Umm, really? Then perhaps this *is* a bug. Peter? -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump with tables created in schemas created by extensions
Hi, 2016-08-25 8:10 GMT-03:00 Michael Paquier: > On Thu, Aug 25, 2016 at 10:25 AM, Martín Marqués > wrote: >> 2016-08-24 21:34 GMT-03:00 Michael Paquier : >>> >>> Yes, you are right. If I look at the diffs this morning I am seeing >>> the ACLs being dumped for this aggregate. So we could just fix the >>> test and be done with it. I did not imagine the index issue though, >>> and this is broken for some time, so that's not exclusive to 9.6 :) >> >> Do you see any easier way than what I mentioned earlier (adding a >> selectDumpableIndex() function) to fix the index dumping issue? > > Yes, we are going to need something across those lines. And my guess > is that this is going to be rather close to getOwnedSeqs() in terms of > logic. I was able to get this fixed without any further new functions (just using the dump/dump_contains and applying the same fix on selectDumpableTable). Main problem relied here in getIndexes() @@ -6158,7 +6167,7 @@ getIndexes(Archive *fout, TableInfo tblinfo[], int numTables) continue; /* Ignore indexes of tables whose definitions are not to be dumped */ - if (!(tbinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)) + if (!(tbinfo->dobj.dump_contains & DUMP_COMPONENT_DEFINITION)) continue; if (g_verbose) But we have to set dump_contains with correct values. There's still one issue, which I'll add a test for as well, which is that if the index was created by the extension, it will be dumped anyway. I'll ave a look at that as well. One other thing I found was that one of the CREATE INDEX tests had incorrectly set like and unlike for pre_data and post_data. (indexes are dumped in section post_data) That's been fixes as well. I've cleaned up the patch a bit, so this is v3 with all checks passing. I'll add that new test regarding dumping an index created by the extension (which will fail) and look for ways to fix it. Regards, -- Martín Marquéshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services pgdump-extension-v3.patch Description: invalid/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Odd oid-system-column handling in postgres_fdw
On 04/05/2016 11:15 AM, Etsuro Fujita wrote: On 2016/03/16 16:25, Etsuro Fujita wrote: PG9.5 allows us to add an oid system column to foreign tables, using ALTER FOREIGN TABLE SET WITH OIDS, but currently, that column reads as zeroes in postgres_fdw. That seems to me like a bug. So, I'd like to propose to fix that, by retrieving that column from the remote server when requested. I'm attaching a proposed patch for that. I rebased the patch against HEAD. Updated patch attached. Committed, thanks! - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Showing parallel status in \df+
Pavel Stehulewrites: > Using footer for this purpose is little bit strange. What about following > design? > 1. move out source code of PL functions from \df+ > 2. allow not unique filter in \sf and allow to display multiple functions Wasn't that proposed and rejected upthread? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Patch] RBTree iteration interface improvement
Hello, Heikki. Thank you for your attention to this patch! > This also seems to change the API so that instead of a single > rb_begin_iterate()+rb_iterate() pair, there is a separate begin and > iterate function for each traversal order. That seems like an unrelated > change. Was there a particular reason for that? I think the old > rb_begin_iterate()+rb_iterate() functions were fine, we just need to > have a RBTreeIterator struct that's different from the tree itself. I'm trying to avoid calling procedures by a pointer, an old habit. When I started to work on this patch I just needed a RB-tree implementation for a pet project. I took one from PostgreSQL code. Then I found this flaw of iteration interface and decided to fix it. The idea to merge this fix back to PostgreSQL code appeared later so I just wrote code the way I like. These days code performance depends on many factors like whether code fits into cache, i.e not only on whether you call a procedure directly or using a pointer. Until someone finds a real bottleneck here I think current rb_begin_iterate()+rb_iterate() interface should do just fine. > Another unrelated change in this patch is the addition of > rb_rightmost(). It's not used for anything, so I'm not sure what the > point is. Then again, there don't seem to be any callers of > rb_leftmost() either. It's just something I needed in tests to reach a good percent of code coverage. Implementation of rb_rightmost is trivial so we probably can do without it. > I think we should something like in the attached patch. It seems to pass > your test suite, but I haven't done any other testing on this. Does it > look OK to you? Looks good to me. -- Best regards, Aleksander Alekseev -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Unsupported feature F867: WITH TIES
Actually we don't support the SQL feature F867 "FETCH FIRST clause: WITH TIES option". On the other side we support the window function "rank()" which acts like the "WITH TIES option". My questions are: Is it hard to implement the "WITH TIES option"? Are there plans for a realization / how do we decide in general what to do next? Differs the semantic of the "WITH TIES option" significantly from the "rank()" window function or can we treat it as some kind of 'syntactical sugar'? Regards, Jürgen Purtz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_xlogdump fails to handle WAL file with multi-page XLP_FIRST_IS_CONTRECORD data
On Wed, Mar 23, 2016 at 7:04 PM, Pavan Deolaseewrote: > > > On Wed, Mar 23, 2016 at 1:13 PM, Michael Paquier > wrote: >> >> >> + /* >> +* Compute targetRecOff. It should typically be greater than short >> +* page-header since a valid record can't , but can also be zero >> when >> +* caller has supplied a page-aligned address or when we are >> skipping >> +* multi-page continuation record. It doesn't matter though >> because >> +* ReadPageInternal() will read at least short page-header worth >> of >> +* data >> +*/ >> This needs some polishing, there is an unfinished sentence here. >> >> + targetRecOff = tmpRecPtr % XLOG_BLCKSZ; >> targetRecOff, pageHeaderSize and targetPagePtr could be declared >> inside directly the new while loop. > > > Thanks Michael for reviewing the patch. I've fixed these issues and new > version is attached. The patch looks good to me. Barring any objections, I'll push this and back-patch to 9.3 where pg_xlogdump was added. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch proposal
* Venkata B Nagothi (nag1...@gmail.com) wrote: > On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frostwrote: > > * Venkata B Nagothi (nag1...@gmail.com) wrote: > > > This behaviour will be similar to that of recovery_target="immediate" and > > > can be aliased. > > > > I don't believe we're really going at this the right way. Clearly, > > there will be cases where we'd like promotion at the end of the WAL > > stream (as we currently have) even if the recovery point is not found, > > but if the new option's "promote" is the same as "immediate" then we > > don't have that. > > My apologies for the confusion. Correction - I meant, "promote" option > would promote the database once the consistent point is reached at the > end-of-the-WAL. "consistent point" and "end-of-the-WAL" are not the same. > > recovery_target = immediate, other > > > > recovery_action_target_found = promote, pause, shutdown > > This is currently supported by the existing parameter called > "recovery_target_action" Right, sure, we could possibly use that, or create an alias, etc. > > recovery_action_target_not_found = promote, pause, shutdown > > This is exactly what newly proposed parameter will do. Then it isn't the same as 'recovery_target = immediate'. > > One question to ask is if we need to support an option for xid and time > > related to when we realize that we won't find the recovery target. If > > consistency is reached at a time which is later than the recovery target > > for time, what then? Do we go through the rest of the WAL and perform > > the "not found" action at the end of the WAL stream? If we use that > > approach, then at least all of the recovery target types are handled the > > same, but I can definitely see cases where an administrator might prefer > > an "error" option. > > Currently, this situation is handled by recovery_target_inclusive > parameter. No, that's not the same. > Administrator can choose to stop the recovery at any consistent > point before or after the specified recovery target. Is this what you were > referring to ? Not quite. > I might need a bit of clarification here, the parameter i am proposing > would be effective only if the specified recovery target is not reached and > may not be effective if the recovery goes beyond recovery target point. Ok, let's consider this scenario: Backup started at: 4/22D3B8E0, time: 2016-08-26 08:29:08 Backup completed (consistency) at: 4/22D3B8EF, 2016-08-26 08:30:00 recovery_target is not set recovery_target_time = 2016-08-26 08:29:30 recovery_target_inclusive = false In such a case, we will necessairly commit transactions which happened between 8:29:30 and 8:30:00 and only stop (I believe..) once we've reached consistency. We could possibly detect that case and throw an error instead. The same could happen with xid. Working through more scenarios would be useful, I believe. For example, what if the xid or time specified happened before the backup started? Basically, what I was trying to get at is that we might be able to detect that we'll never find a given recovery target point without actually replaying all of WAL and wondering if we should provide options to control what to do in such a case. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] shm_mq_set_sender() crash
Latest from lorikeet: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet=2016-08-26%2008%3A37%3A27 TRAP: FailedAssertion("!(vmq->mq_sender == ((void *)0))", File: "/home/andrew/bf64/root/REL9_6_STABLE/pgsql.build/../pgsql/src/backend/storage/ipc/shm_mq.c", Line: 220) Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Add the "snapshot too old" feature
On Thu, Aug 25, 2016 at 4:51 PM, Alvaro Herrerawrote: > Kevin Grittner wrote: >> On Thu, Aug 25, 2016 at 2:56 PM, Alvaro Herrera >> wrote: >> >> > I'm wondering about the TestForOldSnapshot call in scanPendingInsert(). >> > Why do we apply it to the metapage buffer (line 1717 in master)? >> >> If there is any chance that GinPageGetMeta(page)->head could have >> changed from a valid block number to InvalidBlockNumber or a >> different pending-list page due to a vacuum freeing pages from the >> pending-list, the metapage must be checked -- there is no other way >> to detect that data might have disappeared. > > Hmm ... but the disappearing data is moved to regular GIN pages, right? > It doesn't just go away. I suppose that the error would be raised as > soon as we scan a GIN data page that, because of receiving data from the > pending list, has a newer LSN. That would cover things as long as data is always moved from the pending list to a data page before it is vacuumed away. If that is true, there is no need to check the metapage *or* the pending list -- but I'm pretty skeptical that this is the case. The real question is whether pages are ever removed from the pending list. > I don't know GIN in detail but perhaps > it's possible that the data is inserted into data pages in lsn A, then > removed from the pending list in lsn B (where A < B). If the snapshot's > LSN lies in between, a spurious error would be raised. The implementation does allow false positives and slightly less aggressive early cleanup than could be achieved -- in order to avoid the extra locking that would be needed to achieve higher precision. Since I expect that the threshold will usually be set to at least a couple hours, these effects should have minimal impact. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
Hi, On Fri, 2016-08-26 at 21:12 +0900, Michael Paquier wrote: > In short, with the current names, sometimes users think that pg_xlog > and pg_clog are just logs. And so it is fine to delete them to free up > space, corrupting their cluster, because they are just *logs*. ...and we also have "pg_logical", that includes a "log" keyword already... Regards, -- Devrim GÜNDÜZ EnterpriseDB: http://www.enterprisedb.com PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer Twitter: @DevrimGunduz , @DevrimGunduzTR signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On Fri, Aug 26, 2016 at 8:31 PM, Simon Riggswrote: > On 26 August 2016 at 04:39, Michael Paquier wrote: >> Hi all, >> >> I am relaunching $subject as 10 development will begin soon. As far as >> I know, there is agreement that we can do something here. Among the >> different proposals I have found: >> - pg_clog renamed to pg_commit_status, pg_xact or pg_commit >> - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal > > Don't mean to be a party pooper, but what discussion and agreement are > we referring to here? > > If we are going to suggest doing something we really should summarize > the reason for doing it rather than assume it is self evident, cos it > certainly isn't. This thread was the previous one on the matter: http://www.postgresql.org/message-id/caaswcxcvgma9kgeu-esc6u928mw67nozvnawbpusw7r7an9...@mail.gmail.com In short, with the current names, sometimes users think that pg_xlog and pg_clog are just logs. And so it is fine to delete them to free up space, corrupting their cluster, because they are just *logs*. Personally I have seen that, and based on the thread I am not the only one. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PG_DIAG_SEVERITY and a possible bug in pq_parse_errornotice()
Robert Haaswrites: > I don't have strong feelings about this. Technically, this issue > affects 9.5 also, because pqmq.c was introduced in that release. I > don't think we want to add another error field in a released branch. > However, since there's no parallel query in 9.5, only people who are > accessing that functionality via extension code would be affected, > which might be nobody and certainly isn't a lot of people, so we could > just leave this unfixed in 9.5. After sleeping on it, I think the right answer is to introduce the new error-message field (and not worry about 9.5). Will work on a patch for that, unless I hear objections pretty soon. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [Patch] RBTree iteration interface improvement
On 08/22/2016 01:00 PM, Aleksander Alekseev wrote: It seems clear to me that the existing arrangement is hazardous for any RBTree that hasn't got exactly one consumer. I think Aleksander's plan to decouple the iteration state is probably a good one (NB: I've not read the patch, so this is not an endorsement of details). I'd go so far as to say that we should remove the old API as being dangerous, rather than preserve it on backwards-compatibility grounds. We make bigger changes than that in internal APIs all the time. Glad to hear it! I would like to propose a patch that removes the old API. I have a few questions though. Would you like it to be a separate patch, or all-in-one patch is fine in this case? I also would like to include unit/property-based tests in this (these) patch (patches). However as I understand there are currently no unit tests in PostgreSQL at all, only integration/system tests. Any advice how to do it better? OK, here is a patch. I think we could call it a _replacement_ of an old API, so there is probably no need in two separate patches. I still don't know how to include unit test and whether they should be included at all. Thus for now unit/property-based tests could be found only on GitHub [1]. As usual, if you have any questions, suggestions or notes regarding this patch, please let me know. This also seems to change the API so that instead of a single rb_begin_iterate()+rb_iterate() pair, there is a separate begin and iterate function for each traversal order. That seems like an unrelated change. Was there a particular reason for that? I think the old rb_begin_iterate()+rb_iterate() functions were fine, we just need to have a RBTreeIterator struct that's different from the tree itself. Note that the API actually used to be like that. rb_begin_iterate() used to return a palloc'd RBTreeIterator struct. It was changed in commit 0454f131, because the implementation didn't support more than one iterator at a time, which made the separate struct pointless. But we're fixing that problem now. Another unrelated change in this patch is the addition of rb_rightmost(). It's not used for anything, so I'm not sure what the point is. Then again, there don't seem to be any callers of rb_leftmost() either. I think we should something like in the attached patch. It seems to pass your test suite, but I haven't done any other testing on this. Does it look OK to you? - Heikki diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c index d6422ea..71c64e4 100644 --- a/src/backend/access/gin/ginbulk.c +++ b/src/backend/access/gin/ginbulk.c @@ -255,7 +255,7 @@ qsortCompareItemPointers(const void *a, const void *b) void ginBeginBAScan(BuildAccumulator *accum) { - rb_begin_iterate(accum->tree, LeftRightWalk); + rb_begin_iterate(accum->tree, LeftRightWalk, >tree_walk); } /* @@ -271,7 +271,7 @@ ginGetBAEntry(BuildAccumulator *accum, GinEntryAccumulator *entry; ItemPointerData *list; - entry = (GinEntryAccumulator *) rb_iterate(accum->tree); + entry = (GinEntryAccumulator *) rb_iterate(>tree_walk); if (entry == NULL) return NULL; /* no more entries */ diff --git a/src/backend/lib/rbtree.c b/src/backend/lib/rbtree.c index 4fa8a1d..048366d 100644 --- a/src/backend/lib/rbtree.c +++ b/src/backend/lib/rbtree.c @@ -28,18 +28,6 @@ #include "lib/rbtree.h" - -/* - * Values of RBNode.iteratorState - * - * Note that iteratorState has an undefined value except in nodes that are - * currently being visited by an active iteration. - */ -#define InitialState (0) -#define FirstStepDone (1) -#define SecondStepDone (2) -#define ThirdStepDone (3) - /* * Colors of nodes (values of RBNode.color) */ @@ -53,10 +41,6 @@ struct RBTree { RBNode *root; /* root node, or RBNIL if tree is empty */ - /* Iteration state */ - RBNode *cur; /* current iteration node */ - RBNode *(*iterate) (RBTree *rb); - /* Remaining fields are constant after rb_create */ Size node_size; /* actual size of tree nodes */ @@ -75,8 +59,19 @@ struct RBTree */ #define RBNIL () -static RBNode sentinel = {InitialState, RBBLACK, RBNIL, RBNIL, NULL}; +static RBNode sentinel = {RBBLACK, RBNIL, RBNIL, NULL}; +/* + * Values used in the RBTreeIterator.next_state field, with an + * InvertedWalk iterator. + */ +typedef enum InvertedWalkNextStep +{ + NextStepBegin, + NextStepUp, + NextStepLeft, + NextStepRight +} InvertedWalkNextStep; /* * rb_create: create an empty RBTree @@ -123,8 +118,6 @@ rb_create(Size node_size, Assert(node_size > sizeof(RBNode)); tree->root = RBNIL; - tree->cur = RBNIL; - tree->iterate = NULL; tree->node_size = node_size; tree->comparator = comparator; tree->combiner = combiner; @@ -437,7 +430,6 @@ rb_insert(RBTree *rb, const RBNode *data, bool *isNew) x = rb->allocfunc (rb->arg); - x->iteratorState = InitialState; x->color = RBRED; x->left = RBNIL; @@ -653,171 +645,194 @@ rb_delete(RBTree *rb,
Re: [HACKERS] Write Ahead Logging for Hash Indexes
On Thu, Aug 25, 2016 at 6:54 PM, Alvaro Herrerawrote: > Amit Kapila wrote: >> On Wed, Aug 24, 2016 at 11:46 PM, Alvaro Herrera >> wrote: > >> > Can you split the new xlog-related stuff to a new file, say hash_xlog.h, >> > instead of cramming it in hash.h? Removing the existing #include >> > "xlogreader.h" from hash.h would be nice. I volunteer for pushing any >> > preliminary header cleanup commits. >> >> So, what you are expecting here is that we create hash_xlog.h and move >> necessary functions like hash_redo(), hash_desc() and hash_identify() >> to it. Then do whatever else is required to build it successfully. >> Once that is done, I can build my patches on top of it. Is that >> right? If yes, I can work on it. > > Yes, thanks. > How about attached? If you want, I think we can one step further and move hash_redo to a new file hash_xlog.c which is required for main patch, but we can leave it for later as well. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com add_hash_xlog-v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On 26 August 2016 at 04:39, Michael Paquierwrote: > Hi all, > > I am relaunching $subject as 10 development will begin soon. As far as > I know, there is agreement that we can do something here. Among the > different proposals I have found: > - pg_clog renamed to pg_commit_status, pg_xact or pg_commit > - pg_xlog renamed to pg_xjournal, pg_wal or pg_journal Don't mean to be a party pooper, but what discussion and agreement are we referring to here? If we are going to suggest doing something we really should summarize the reason for doing it rather than assume it is self evident, cos it certainly isn't. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming of pg_xlog and pg_clog
On Fri, Aug 26, 2016 at 12:44 PM, Christoph Bergwrote: > Re: Fujii Masao 2016-08-26 cmzgv5u6oemxr-cbjro+w...@mail.gmail.com> > > > I agree on a hard break, unless we get pushback from users, and even > > > then, they can create the symlinks themselves. > > > > I strongly prefer symlink approach not to break many existing tools > > and scripts. > > Symlinks might actually be worse than removing the directories > altogether. If your backup tool fails because the pg_xlog directory is > gone, you'll hopefully notice, but if you end up with a backup that > consists merely of a copy of a symlink named pg_xlog, you might not > notice. > +1. It's *much* better to cause a clean break. That way people will notice it, and can fix it (and it will be an easy fix). An unclean break might leave people with things that look like they work, but don't. That's a lot more dangerous. Same reason I'm also +1 for Stephens suggestion to put all things that should not be in a base backup into the same directory. That may break things now, but it will simplify things down the road. And doing it at the same time as renaming these things makes a lot of sense, because it causes breakage that tool-builders *have* to look at, and then they will hopefully also notice the other change. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] Renaming of pg_xlog and pg_clog
Re: Fujii Masao 2016-08-26> > I agree on a hard break, unless we get pushback from users, and even > > then, they can create the symlinks themselves. > > I strongly prefer symlink approach not to break many existing tools > and scripts. Symlinks might actually be worse than removing the directories altogether. If your backup tool fails because the pg_xlog directory is gone, you'll hopefully notice, but if you end up with a backup that consists merely of a copy of a symlink named pg_xlog, you might not notice. Christoph -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplifying the interface of UpdateMinRecoveryPoint
On 07/13/2016 04:25 PM, Michael Paquier wrote: On Wed, Jul 13, 2016 at 8:27 PM, Amit Kapilawrote: On Wed, Jul 13, 2016 at 9:19 AM, Michael Paquier wrote: Hence why not simplifying its interface and remove the force flag? One point to note is that the signature and usage of UpdateMinRecoveryPoint() is same as it was when it got introduced in commit-cdd46c76. Now the only reasons that come to my mind for introducing the force parameter was (a) it looks cleaner that way to committer (b) they have some usecase for the same in mind (c) it got have overlooked. Now, if it got introduced due to (c), then your patch does the right thing by removing it. Personally, I feel overloading the parameter for multiple purposes makes code less maintainable, so retaining as it is in HEAD has some merits. There is no way to tell what that is, but perhaps Heikki recalls something on the matter. I am just adding him in CC. No, I don't remember. Maybe the function originally used the caller-supplied 'lsn' value as the value to force-update minRecoveryPoint to. Or I anticipated that some callers might want to do that in the future. If we were to do this, it might be better to still have a 'force' variable inside the function, to keep the if()s slighltly more readable, like: bool force = XLogRecPtrIsInvalid(lsn); But even then, I don't think this makes it really any more readable overall. Not worse either, but it's a wash. I'll just mark this as rejected in the commitfest, let's move on. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] OpenSSL 1.1 breaks configure and more
On 07/05/2016 04:46 PM, Andreas Karlsson wrote: @@ -280,8 +287,9 @@ px_find_digest(const char *name, PX_MD **res) digest = px_alloc(sizeof(*digest)); digest->algo = md; - EVP_MD_CTX_init(>ctx); - if (EVP_DigestInit_ex(>ctx, digest->algo, NULL) == 0) + digest->ctx = EVP_MD_CTX_create(); + EVP_MD_CTX_init(digest->ctx); + if (EVP_DigestInit_ex(digest->ctx, digest->algo, NULL) == 0) return -1; h = px_alloc(sizeof(*h)); Now that we're calling EVP_MD_CTX_create((), which allocates memory, are we risking memory leaks? It has always been part of the contract that you have to call px_md_free(), for any context returned by px_find_digest(), but I wonder just how careful we have been about that. Before this, you would probably get away with it without leaking, if the digest implementation didn't allocate any extra memory or other resources. At least pg_digest and try_unix_std functions call px_find_digest(), and then do more palloc()s which could elog() if you run out of memory, leaking th digest struct. Highly unlikely, but I think it would be fairly straightforward to reorder those calls to eliminate the risk, so we probably should. @@ -854,6 +858,25 @@ load_dh_buffer(const char *buffer, size_t len) return dh; } +static DH * +generate_dh_params(int prime_len, int generator) +{ +#if SSLEAY_VERSION_NUMBER >= 0x00908000L + DH *dh; + + if ((dh = DH_new()) == NULL) + return NULL; + + if (DH_generate_parameters_ex(dh, prime_len, generator, NULL)) + return dh; + + DH_free(dh); + return NULL; +#else + return DH_generate_parameters(prime_len, generator, NULL, NULL); +#endif +} + I think now would be a good time to drop support for OpenSSL versions older than 0.9.8. OpenSSL don't even support 0.9.8 anymore, although there are probably distributions out there that still provide patches for it. But OpenSSL 0.9.7 and older are really not interesting for PostgreSQL 10 anymore, I think. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Wed, Aug 17, 2016 at 6:10 PM, Robert Haaswrote: >> If you are making changes in plpgsql_validator(), then shouldn't we >> make changes in plperl_validator() or plpython_validator()? I see >> that you have made changes to function CheckFunctionValidatorAccess() >> which seems to be getting called from *_validator() (* indicates >> plpgsql/plperl/plpython) functions. Is there a reason for changing >> the *_validator() function? Yes you are right, making changes in CheckFunctionValidatorAccess is sufficient, no need to make changes in *_validator (* indicates plpgsql/plperl/plpython) functions. I have changed this in attached patch.. > > Yeah, when I glanced briefly at this patch, I found myself wondering > whether all of these call sites were actually reachable (and why the > patch contained no test cases to prove it). I have also added test cases to cover all my changes. Basically this patch changes error at three places. 1. getBaseTypeAndTypmod: This is being called from domain_in exposed function (domain_in-> domain_state_setup-> getBaseTypeAndTypmod). Though this function is being called from many other places which are not exposed function, but I don't this will have any imapct, will this ? 2. lookup_type_cache: This is being called from record_in (record_in->lookup_rowtype_tupdesc-> lookup_rowtype_tupdesc_internal->lookup_type_cache). 3. CheckFunctionValidatorAccess: This is being called from all language validator functions. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com cache_lookup_failure_v4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] increasing the default WAL segment size
On Wed, 24 Aug 2016 21:31:35 -0400 Robert Haaswrote: > Hi, > > I'd like to propose that we increase the default WAL segment size, > which is currently 16MB. It was first set to that value in commit > 47937403676d913c0e740eec6b85113865c6c8ab in October of 1999; prior to > that, it was 64MB. Between 1999 and now, there have been three > significant changes that make me think it might be time to rethink > this value: > > Thoughts? >From my ignorance, could block size affect this WAL size increase? In past (didn't tried with >9 versions) you can change disk block page size from 8KB to 16 KB or 32KB (or 4) modifing src/include/pg_config.h BLCKSZ 8192 and recompiling. (There are some mails in 1999-2002 about this topic) > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company --- --- Eduardo Morras -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Declarative partitioning - another take
On 2016/08/18 5:23, Robert Haas wrote: > On Wed, Aug 17, 2016 at 2:21 AM, Amit Langote >wrote: >> I am slightly tempted to eliminate the pg_partition catalog and associated >> syscache altogether and add a column to pg_class as Robert suggested. >> That way, all relid_is_partition() calls will be replaced by >> rel->rd_partbound != NULL check. But one potential problem with that >> approach is that now whenever a parent relation is opened, all the >> partition relations must be opened to get the partbound value (to form the >> PartitionDesc to be stored in parent relation's rd_partdesc). Whereas >> currently, we just look up the pg_partition catalog (or the associated >> cache) for every partition and that gets us the partbound. > > Well, you could just look up the pg_class row without opening the > relation, too. There is a system cache on pg_class.oid, after all. I Yes, I somehow didn't think of that. > think the issue is whether it's safe to read either one of those > things without a lock on the child relation. If altering the > partitioning information for a relation requires holding only > AccessExclusiveLock on that relation, and no lock on the parent, then > you really can't read the information for any child relation without > taking at least AccessShareLock. Otherwise, it might change under > you, and that would be bad. I'd imagine this won't be a problem because we take an AccessExclusiveLock on the parent when adding/removing a partition. > I'm inclined to think that changing the partitioning information for a > child is going to require AccessExclusiveLock on both the child and > the parent. That seems unfortunate from a concurrency point of view, > but we may be stuck with it: suppose you require only > ShareUpdateExclusiveLock on the parent. Well, then a concurrent read > transaction might see the partition boundaries change when it does a > relcache rebuild, which would cause it to suddenly start expecting the > data to be in a different plan in mid-transaction, perhaps even in > mid-scan. Maybe that's survivable with really careful coding, but it > seems like it's probably a bad thing. For example, it would mean that > the executor would be unable to rely on the partitioning information > in the relcache remaining stable underneath it. Moreover, the > relcache is always going to be scanned with the most recent possible > MVCC snapshot, but the transaction snapshot may be older, so such a > system creates all sorts of nasty possibilities for there to be skew > between the snapshot being used to via the data and the snapshot being > used to read the metadata that says where the data is. We do take a lock on the parent because we would be changing its partition descriptor (relcache). I changed MergeAttributes() such that an AccessExclusiveLock instead of ShareUpdateExclusiveLock is taken if the parent is a partitioned table. > This may need some more thought, but if we go with that approach of > requiring an AccessExclusiveLock on both parent and child, then it > seems to me that maybe we should consider the partitioning information > to be a property of the parent rather than the child. Just take all > the partitioning information for all children and put it in one big > node tree and store it in the pg_class or pg_partition_root entry for > the parent as one big ol' varlena. Now you can open the parent and > get all of the partitioning information for all of the children > without needing any lock on any child, and that's *really* good, > because it means that some day we might be able to do partition > elimination before locking any of the children! That would be > excellent. If we need an AccessExclusiveLock on parent to add/remove a partition (IOW, changing that child table's partitioning information), then do we need to lock the individual partitions when reading partition's information? I mean to ask why the simple syscache look-ups to get each partition's bound wouldn't do. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch proposal
On Fri, Aug 26, 2016 at 12:30 PM, Stephen Frostwrote: > * Venkata B Nagothi (nag1...@gmail.com) wrote: > > On Thu, Aug 25, 2016 at 10:59 PM, Stephen Frost > wrote: > > > I'm not a fan of the "recovery_target" option, particularly as it's > only > > > got one value even though it can mean two things (either "immediate" or > > > "not set"), but we need a complete solution before we can consider > > > deprecating it. Further, we could consider making it an alias for > > > whatever better name we come up with. > > > > The new parameter will accept options : "pause", "shutdown" and "promote" > > > > *"promote"* > > > > This option will ensure database starts up once the "immediate" > consistent > > recovery point is reached even if it is well before the mentioned > recovery > > target point (XID, Name or time). > > This behaviour will be similar to that of recovery_target="immediate" and > > can be aliased. > > I don't believe we're really going at this the right way. Clearly, > there will be cases where we'd like promotion at the end of the WAL > stream (as we currently have) even if the recovery point is not found, > but if the new option's "promote" is the same as "immediate" then we > don't have that. > My apologies for the confusion. Correction - I meant, "promote" option would promote the database once the consistent point is reached at the end-of-the-WAL. > We need to break this down into all the different possible combinations > and then come up with names for the options to define them. I don't > believe a single option is going to be able to cover all of the cases. > > The cases which I'm considering are: > > recovery target is immediate (as soon as we have consistency) > recovery target is a set point (name, xid, time, whatever) > > action to take if recovery target is found > action to take if recovery target is not found > > Generally, "action" is one of "promote", "pause", or "shutdown". > Clearly, not all actions are valid for all recovery target cases- in > particular, "immediate" with "recovery target not found" can not support > the "promote" or "pause" options. Otherwise, we can support: > I Agree. This is a valid point to consider. I might have few questions over this at a later stage. Recovery Target | Found | Action > -|-|-- > immediate| Yes| promote > immediate| Yes| pause > immediate| Yes| shutdown > > immediate| No | shutdown > > name/xid/time| Yes| promote > name/xid/time| Yes| pause > name/xid/time| Yes| shutdown > > name/xid/time| No | promote > name/xid/time| No | pause > name/xid/time| No | shutdown > > We could clearly support this with these options: > > recovery_target = immediate, other > recovery_action_target_found = promote, pause, shutdown > This is currently supported by the existing parameter called "recovery_target_action" > recovery_action_target_not_found = promote, pause, shutdown > This is exactly what newly proposed parameter will do. > One question to ask is if we need to support an option for xid and time > related to when we realize that we won't find the recovery target. If > consistency is reached at a time which is later than the recovery target > for time, what then? Do we go through the rest of the WAL and perform > the "not found" action at the end of the WAL stream? If we use that > approach, then at least all of the recovery target types are handled the > same, but I can definitely see cases where an administrator might prefer > an "error" option. > Currently, this situation is handled by recovery_target_inclusive parameter. Administrator can choose to stop the recovery at any consistent point before or after the specified recovery target. Is this what you were referring to ? I might need a bit of clarification here, the parameter i am proposing would be effective only if the specified recovery target is not reached and may not be effective if the recovery goes beyond recovery target point. Regards, Venkata B N Fujitsu Australia
Re: [HACKERS] WAL consistency check facility
Thanks a lot. I just want to mention the situation where I was getting the speculative token related inconsistency. ItemPointer in backup page from master: LOG: ItemPointer BlockNumber: 1 OffsetNumber:65534 Speculative: true CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1 ItemPointer in current page from slave after redo: LOG: ItemPointer BlockNumber: 0 OffsetNumber:1 Speculative: false CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1 As the block numbers are different, I was getting the following warning: WARNING: Inconsistent page (at byte 8166) found for record 0/127F4A48, rel 1663/16384/16946, forknum 0, blkno 0, Backup Page Header : (pd_lower: 28 pd_upper: 8152 pd_special: 8192) Current Page Header: (pd_lower: 28 pd_upper: 8152 pd_special: 8192) CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1 In heap_xlog_insert, t_ctid is always set to blkno and xlrec->offnum. I think this is why I was getting the above warning. On Thu, Aug 25, 2016 at 10:33 PM, Alvaro Herrerawrote: > Kuntal Ghosh wrote: > >> 4. For Speculative Heap tuple insert operation, there was >> inconsistency in t_ctid value. So, I've modified the t_ctid value (in >> backup page) to current block number and offset number. Need >> suggestions!! > > In speculative insertions, t_ctid is used to store the speculative > token. I think you should just mask that field out in that case (which > you can recognize because ip_posid is set to magic value 0xfffe). > > -- > Álvaro Herrerahttp://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Thanks & Regards, Kuntal Ghosh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, Aug 26, 2016 at 11:37 AM, Masahiko Sawadawrote: > On Fri, Aug 26, 2016 at 3:03 PM, Ashutosh Bapat > wrote: > > > > > > On Fri, Aug 26, 2016 at 11:22 AM, Masahiko Sawada > > > wrote: > >> > >> On Fri, Aug 26, 2016 at 1:32 PM, Vinayak Pokale > >> wrote: > >> > Hi All, > >> > > >> > Ashutosh proposed the feature 2PC for FDW for achieving atomic commits > >> > across multiple foreign servers. > >> > If a transaction make changes to more than two foreign servers the > >> > current > >> > implementation in postgres_fdw doesn't make sure that either all of > them > >> > commit or all of them rollback their changes. > >> > > >> > We (Masahiko Sawada and me) reopen this thread and trying to > contribute > >> > in > >> > it. > >> > > >> > 2PC for FDW > >> > > >> > The patch provides support for atomic commit for transactions > involving > >> > foreign servers. when the transaction makes changes to foreign > servers, > >> > either all the changes to all the foreign servers commit or rollback. > >> > > >> > The new patch 2PC for FDW include the following things: > >> > 1. The patch 0001 introduces a generic feature. All kinds of FDW that > >> > support 2PC such as oracle_fdw, mysql_fdw, postgres_fdw etc. can > involve > >> > in > >> > the transaction. > >> > > >> > Currently we can push some conditions down to shard nodes, especially > in > >> > 9.6 > >> > the directly modify feature has > >> > been introduced. But such a transaction modifying data on shard node > is > >> > not > >> > executed surely. > >> > Using 0002 patch, that modify is executed with 2PC. It means that we > >> > almost > >> > can provide sharding solution using > >> > multiple PostgreSQL server (one parent node and several shared node). > >> > > >> > For multi master, we definitely need transaction manager but > transaction > >> > manager probably can use this 2PC for FDW feature to manage > distributed > >> > transaction. > >> > > >> > 2. 0002 patch makes postgres_fdw possible to use 2PC. > >> > > >> > 0002 patch makes postgres_fdw to use below APIs. These APIs are > generic > >> > features which can be used by all kinds of FDWs. > >> > > >> > a. Execute PREAPRE TRANSACTION and COMMIT/ABORT PREAPRED instead > of > >> > COMMIT/ABORT on foreign server which supports 2PC. > >> > b. Manage information of foreign prepared transactions resolver > >> > > >> > Masahiko Sawada will post the patch. > >> > > >> > > >> > > > > Thanks Vinayak and Sawada-san for taking this forward and basing your > work > > on my patch. > > > >> > >> Still lot of work to do but attached latest patches. > >> These are based on the patch Ashutosh posted before, I revised it and > >> divided into two patches. > >> Compare with original patch, patch of pg_fdw_xact_resolver and > >> documentation are lacked. > > > > > > I am not able to understand the last statement. > > Sorry to confuse you. > > > Do you mean to say that your patches do not have pg_fdw_xact_resolver() > and > > documentation that my patches had? > > Yes. > I'm confirming them that your patches had. > Thanks for the clarification. I had added pg_fdw_xact_resolver() to resolve any transactions which can not be resolved immediately after they were prepared. There was a comment from Kevin (IIRC) that leaving transactions unresolved on the foreign server keeps the resources locked on those servers. That's not a very good situation. And nobody but the initiating server can resolve those. That functionality is important to make it a complete 2PC solution. So, please consider it to be included in your first set of patches. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, Aug 26, 2016 at 3:03 PM, Ashutosh Bapatwrote: > > > On Fri, Aug 26, 2016 at 11:22 AM, Masahiko Sawada > wrote: >> >> On Fri, Aug 26, 2016 at 1:32 PM, Vinayak Pokale >> wrote: >> > Hi All, >> > >> > Ashutosh proposed the feature 2PC for FDW for achieving atomic commits >> > across multiple foreign servers. >> > If a transaction make changes to more than two foreign servers the >> > current >> > implementation in postgres_fdw doesn't make sure that either all of them >> > commit or all of them rollback their changes. >> > >> > We (Masahiko Sawada and me) reopen this thread and trying to contribute >> > in >> > it. >> > >> > 2PC for FDW >> > >> > The patch provides support for atomic commit for transactions involving >> > foreign servers. when the transaction makes changes to foreign servers, >> > either all the changes to all the foreign servers commit or rollback. >> > >> > The new patch 2PC for FDW include the following things: >> > 1. The patch 0001 introduces a generic feature. All kinds of FDW that >> > support 2PC such as oracle_fdw, mysql_fdw, postgres_fdw etc. can involve >> > in >> > the transaction. >> > >> > Currently we can push some conditions down to shard nodes, especially in >> > 9.6 >> > the directly modify feature has >> > been introduced. But such a transaction modifying data on shard node is >> > not >> > executed surely. >> > Using 0002 patch, that modify is executed with 2PC. It means that we >> > almost >> > can provide sharding solution using >> > multiple PostgreSQL server (one parent node and several shared node). >> > >> > For multi master, we definitely need transaction manager but transaction >> > manager probably can use this 2PC for FDW feature to manage distributed >> > transaction. >> > >> > 2. 0002 patch makes postgres_fdw possible to use 2PC. >> > >> > 0002 patch makes postgres_fdw to use below APIs. These APIs are generic >> > features which can be used by all kinds of FDWs. >> > >> > a. Execute PREAPRE TRANSACTION and COMMIT/ABORT PREAPRED instead of >> > COMMIT/ABORT on foreign server which supports 2PC. >> > b. Manage information of foreign prepared transactions resolver >> > >> > Masahiko Sawada will post the patch. >> > >> > >> > > Thanks Vinayak and Sawada-san for taking this forward and basing your work > on my patch. > >> >> Still lot of work to do but attached latest patches. >> These are based on the patch Ashutosh posted before, I revised it and >> divided into two patches. >> Compare with original patch, patch of pg_fdw_xact_resolver and >> documentation are lacked. > > > I am not able to understand the last statement. Sorry to confuse you. > Do you mean to say that your patches do not have pg_fdw_xact_resolver() and > documentation that my patches had? Yes. I'm confirming them that your patches had. Regards, -- Masahiko Sawada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, Aug 26, 2016 at 11:22 AM, Masahiko Sawadawrote: > On Fri, Aug 26, 2016 at 1:32 PM, Vinayak Pokale > wrote: > > Hi All, > > > > Ashutosh proposed the feature 2PC for FDW for achieving atomic commits > > across multiple foreign servers. > > If a transaction make changes to more than two foreign servers the > current > > implementation in postgres_fdw doesn't make sure that either all of them > > commit or all of them rollback their changes. > > > > We (Masahiko Sawada and me) reopen this thread and trying to contribute > in > > it. > > > > 2PC for FDW > > > > The patch provides support for atomic commit for transactions involving > > foreign servers. when the transaction makes changes to foreign servers, > > either all the changes to all the foreign servers commit or rollback. > > > > The new patch 2PC for FDW include the following things: > > 1. The patch 0001 introduces a generic feature. All kinds of FDW that > > support 2PC such as oracle_fdw, mysql_fdw, postgres_fdw etc. can involve > in > > the transaction. > > > > Currently we can push some conditions down to shard nodes, especially in > 9.6 > > the directly modify feature has > > been introduced. But such a transaction modifying data on shard node is > not > > executed surely. > > Using 0002 patch, that modify is executed with 2PC. It means that we > almost > > can provide sharding solution using > > multiple PostgreSQL server (one parent node and several shared node). > > > > For multi master, we definitely need transaction manager but transaction > > manager probably can use this 2PC for FDW feature to manage distributed > > transaction. > > > > 2. 0002 patch makes postgres_fdw possible to use 2PC. > > > > 0002 patch makes postgres_fdw to use below APIs. These APIs are generic > > features which can be used by all kinds of FDWs. > > > > a. Execute PREAPRE TRANSACTION and COMMIT/ABORT PREAPRED instead of > > COMMIT/ABORT on foreign server which supports 2PC. > > b. Manage information of foreign prepared transactions resolver > > > > Masahiko Sawada will post the patch. > > > > > > Thanks Vinayak and Sawada-san for taking this forward and basing your work on my patch. > Still lot of work to do but attached latest patches. > These are based on the patch Ashutosh posted before, I revised it and > divided into two patches. > Compare with original patch, patch of pg_fdw_xact_resolver and > documentation are lacked. > I am not able to understand the last statement. Do you mean to say that your patches do not have pg_fdw_xact_resolver() and documentation that my patches had? OR you mean to say that my patches did not have (lacked) pg_fdw_xact_resolver() and documenation OR some combination of those? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company