Re: [HACKERS] File based Incremental backup v8
On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
> 0001-public-parse_filename_for_nontemp_relation.patch
> 0002-copydir-LSN-v2.patch
> 0003-File-based-incremental-backup-v8.patch

Hi,

It looks like it only compiles with assert enabled. This is perhaps not yet really a problem at this stage but I thought I'd mention it:

make --quiet -j 8

In file included from gram.y:14403:0:
scan.c: In function 'yy_try_NUL_trans':
scan.c:10174:23: warning: unused variable 'yyg' [-Wunused-variable]
     struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be unused depending upon options. */
                       ^
basebackup.c: In function 'writeBackupProfileLine':
basebackup.c:1545:8: warning: format '%lld' expects argument of type 'long long int', but argument 8 has type '__off_t' [-Wformat=]
         filename);
        ^
basebackup.c:1545:8: warning: format '%lld' expects argument of type 'long long int', but argument 8 has type '__off_t' [-Wformat=]
pg_basebackup.c: In function 'ReceiveTarFile':
pg_basebackup.c:858:2: warning: implicit declaration of function 'assert' [-Wimplicit-function-declaration]
  assert(res || (strcmp(basedir, "-") == 0));
  ^
pg_basebackup.c:865:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
  gzFile ztarfile = NULL;
  ^
pg_basebackup.o: In function `ReceiveAndUnpackTarFile':
pg_basebackup.c:(.text+0x690): undefined reference to `assert'
pg_basebackup.o: In function `ReceiveTarFile':
pg_basebackup.c:(.text+0xeb0): undefined reference to `assert'
pg_basebackup.c:(.text+0x10ad): undefined reference to `assert'
collect2: error: ld returned 1 exit status
make[3]: *** [pg_basebackup] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all-pg_basebackup-recurse] Error 2
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [all-bin-recurse] Error 2
make: *** [all-src-recurse] Error 2

The configure used was:

./configure \
  --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
  --bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin.fast \
  --libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib.fast \
  --with-pgport=6973 --quiet --enable-depend \
  --with-extra-version=_incremental_backup_20150131_1521_08bd0c581158 \
  --with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

A build with --enable-cassert and --enable-debug builds fine:

./configure \
  --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
  --bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin \
  --libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib \
  --with-pgport=6973 --quiet --enable-depend \
  --with-extra-version=_incremental_backup_20150131_1628_08bd0c581158 \
  --enable-cassert --enable-debug \
  --with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

I will further test with that.

thanks,

Erik Rijkers

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables
Oops, forgot to forward to the list (suggestion/feature request to the list admin for the various pg lists: make the default reply go to the list, not the sender, if at all possible). Response below:

On 1/30/15, Jim Nasby jim.na...@bluetreble.com wrote:
On 1/30/15 11:54 AM, Roger Pack wrote:
On 1/29/15, Roger Pack rogerdpa...@gmail.com wrote:

Hello. I see on this page a mention of basically a 4B row limit for tables that have BLOB's

Oops, I meant for BYTEA or TEXT columns, but it's possible the reasoning is the same...

It only applies to large objects, not bytea or text.

OK, I think I figured out possibly why the wiki says this. I guess BYTEA entries over 2KB will be autostored via TOAST, which uses an OID in its backend. So BYTEA has the same limitation. It appears that disabling TOAST is not an option [1]. So I guess if the number of BYTEA entries (in the sum over all tables? partitioning doesn't help?) with size over 2KB exceeds 4 billion, then there is actually no option there? If this occurred it might cause all sorts of things to break? [2]

It's a bit more complex than that. First, toast isn't limited to bytea; it holds for ALL varlena fields in a table that are allowed to store externally. Second, the limit is actually per-table: every table gets its own toast table, and each toast table is limited to 4B unique OIDs. Third, the OID counter is actually global, but the code should handle conflicts by trying to get another OID. See toast_save_datum(), which calls GetNewOidWithIndex().

Now, the reality is that GetNewOidWithIndex() is going to keep incrementing the global OID counter until it finds an OID that isn't in the toast table. That means that if you actually get anywhere close to using 4B OIDs you're going to become extremely unhappy with the performance of toasting new data.

OK, so system stability doesn't degrade per se when it wraps [since they all use that GetNewOid method or similar?], good to know.

So basically when it gets near 4B TOAST'ed rows it may have to wrap that counter and search for an unused number, and for each number it's querying the TOAST table to see if it's already used, degrading performance. So I guess partitioning tables for now is an acceptable workaround, good to know.

Thanks much for your response; good to know the details before we dive into postgres with our 8B row table with BYTEA's in it :)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] File based Incremental backup v8
On 29/01/15 18:57, Robert Haas wrote:
On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote:

The current implementation of the copydir function is incompatible with LSN-based incremental backups. The problem is that new files are created, but their blocks still carry the old LSN, so they will not be backed up because they look old enough.

I think this is trying to pollute what's supposed to be a pure fs-level operation (copy a directory) into something that is aware of specific details like the PostgreSQL page format. I really think that nothing in storage/file should know about the page format. If we need a function that copies a file while replacing the LSNs, I think it should be a new function living somewhere else.

Given that the copydir function is used only during CREATE DATABASE and ALTER DATABASE SET TABLESPACE, we could move/rename it to a better place that clearly marks it as knowing about the page format. I'm open to suggestions on where to place it and on what the correct name should be. However, the whole copydir patch here should be treated as a temporary thing. It is necessary until proper WAL logging of CREATE DATABASE and ALTER DATABASE SET TABLESPACE is implemented to support any form of LSN-based incremental backup.

A bigger problem is that you are proposing to stamp those files with LSNs that are, for lack of a better word, fake. I would expect that this would completely break if checksums are enabled.

I'm sorry, I completely ignored checksums in the previous patch. The attached one works with checksums enabled.

Also, unlogged relations typically have an LSN of 0; this would change that in some cases, and I don't know whether that's OK.

It shouldn't be a problem, because all the code that uses unlogged relations normally skips all the WAL-related operations.
From the point of view of an incremental backup it is also not a problem, because when restoring the backup the unlogged tables will get reinitialized by the crash recovery procedure. However, if you think it is worth the effort, I can rewrite copydir as a two-pass operation, detecting the unlogged tables on the first pass and avoiding the LSN update on unlogged tables. I personally think it isn't worth the effort unless someone identifies a real case where setting LSNs in unlogged relations leads to an issue.

The issues here are similar to those in http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de - basically, I think we need to make CREATE DATABASE and ALTER DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is never going to work right. If we're not going to allow that, we need to disallow hot backups while those operations are in progress.

This is right, but the problem Andres reported is orthogonal to the one I'm addressing here. Without this copydir patch (or without proper WAL logging of copydir operations), you cannot take an incremental backup after a CREATE DATABASE or ALTER DATABASE SET TABLESPACE until you take a full backup and use it as a base.
Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciar...@2ndquadrant.it>
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation
---
 src/backend/storage/file/reinit.c | 58 ---------------------------
 src/common/relpath.c              | 56 +++++++++++++++++++++++++
 src/include/common/relpath.h      |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*** static void ResetUnloggedRelationsInTabl
*** 28,35 ****
  					int op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
  					int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
- 					int *oidchars, ForkNumber *fork);
 
  typedef struct
  {
--- 28,33 ----
*** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 ****
  		fsync_fname((char *) dbspacedirname, true);
  	}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer
Re: [HACKERS] TABLESAMPLE patch
On Fri, Jan 23, 2015 at 5:39 AM, Petr Jelinek p...@2ndquadrant.com wrote: On 19/01/15 07:08, Amit Kapila wrote: On Sun, Jan 18, 2015 at 12:46 AM, Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com wrote: I think that's actually good to have, because we still do costing and the partial index might help produce better estimate of number of returned rows for tablesample as well. I don't understand how will it help, because for tablesample scan it doesn't consider partial index at all as per patch. CREATE TABLE test_tablesample (id INT, name text) WITH (fillfactor=10); INSERT INTO test_tablesample SELECT i, repeat(i::text, 200) FROM generate_series(0, 9) s(i) ORDER BY i; INSERT INTO test_tablesample values(generate_series(10,1); create index idx_tblsample on test_tablesample(id) where id; Analyze test_tablesample; postgres=# explain SELECT id FROM test_tablesample where id ; QUERY PLAN --- Index Only Scan using idx_tblsample on test_tablesample (cost=0.13..8.14 rows= 1 width=4) Index Cond: (id ) (2 rows) postgres=# explain SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI (80) wh ere id ; QUERY PLAN Sample Scan on test_tablesample (cost=0.00..658.00 rows=8000 width=4) Filter: (id ) (2 rows) The above result also clearly indicate that when TABLESAMPLE clause is used, it won't use partial index. Well similar, not same as we are not always fetching whole page or doing visibility checks on all tuples in the page, etc. But I don't see why it can't be inside nodeSamplescan. If you look at bitmap heap scan, that one is also essentially somewhat modified sequential scan and does everything within the node nodeBitmapHeapscan because the algorithm is not used anywhere else, just like sample scan. I don't mind doing everything in nodeSamplescan, however if you can split the function, it will be easier to read and understand, if you see in nodeBitmapHeapscan, that also has function like bitgetpage(). 
This is just a suggestion and if you think that it can be split, then it's okay, otherwise leave it as it is.

Refer parameter (HeapScanDesc->rs_syncscan) and syncscan.c. Basically during sequential scan on the same table by different backends, we attempt to keep them synchronized to reduce the I/O.

Ah this, yes, it makes sense for bernoulli, not for system though. I guess it should be used for sampling methods that use the SAS_SEQUENTIAL strategy.

Have you taken care of this in your latest patch?

I don't know how else to explain: if we loop over every tuple in the relation and return it with equal probability, then visibility checks don't matter, as the percentage of visible tuples will be the same in the result as in the relation. For example, if you have 30% visible tuples and you are interested in 10% of tuples overall, it will return 10% of all tuples, and since every tuple has a 10% chance of being returned, 30% of those should be visible (if we assume smooth distribution of the random numbers generated). So in the end you are getting 10% of the visible tuples, because the scan will obviously skip the invisible ones, and that's what you wanted. As I said, the problem with analyze is that it uses a tuple limit instead of a probability.

Okay, got the point.

Yes, the difference is smaller (in relative numbers) for bigger tables when I test this. On a 1k row table the spread of 20 runs was between 770-818 and on a 100k row table it's between 79868-80154. I think that's acceptable variance and it's imho indeed the random generator that produces this.

Sure, I think this is acceptable.

Oh and BTW when I delete 50k of tuples (make them invisible) the results of 20 runs are between 30880 and 40063 rows. This is between 60% to 80%, lower than what is expected, but I guess we can't do much for this except for trying with reverse order for visibility test and sample tuple call; you can decide if you want to try that once just to see if that is better.

I am sure it could be done with sequence scan.
Not sure if it would be pretty, and more importantly, the TABLESAMPLE is *not* a sequential scan. The fact that the bernoulli method looks similar should not make us assume that every other method does as well, especially when the system method is completely different.

Okay, let's keep it as separate.

Anyway, attached is new version with some updates that you mentioned (all_visible, etc). I also added optional interface for the sampling method to access the tuple contents as I can imagine sampling methods that will want to do that.

+	/*
+	 * Let the sampling method examine the actual tuple and decide if we
+	 * should return it.
+	 *
+	 * Note that we let it examine even invisible tuples.
+	 */
+	if
Re: [HACKERS] pg_check_dir comments and implementation mismatch
On 30/01/15 03:54, Michael Paquier wrote:
On Fri, Jan 30, 2015 at 2:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:

There is at least one other bug in that function now that I look at it: in event of a readdir() failure, it neglects to execute closedir(). Perhaps not too significant since all existing callers will just exit() anyway after a failure, but still ...

I would imagine that code scanners like coverity or similar would not be happy about that. ISTM that it is better to closedir() appropriately in all the code paths.

I've attached a new version of the patch fixing the missing closedir on readdir error.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

From d2bfb6878aed404fdba1447d3f813edb4442ff47 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciar...@2ndquadrant.it>
Date: Sat, 31 Jan 2015 14:06:54 +0100
Subject: [PATCH] Improve pg_check_dir comments and code

Update the pg_check_dir comment to explain all the possible return
values (0 to 4).

Fix the behaviour in presence of a lost+found directory. The previous
version was returning 3 (mount point) even if readdir returned
something after the lost+found directory. In this case the output
should be 4 (not empty).

Ensure that closedir is always called even if readdir returns an
error.
---
 src/port/pgcheckdir.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/port/pgcheckdir.c b/src/port/pgcheckdir.c
index 7061893..02a048c 100644
*** a/src/port/pgcheckdir.c
--- b/src/port/pgcheckdir.c
***************
*** 22,28 ****
  * Returns:
  *	0 if nonexistent
  *	1 if exists and empty
! *	2 if exists and not empty
  *	-1 if trouble accessing directory (errno reflects the error)
  */
 int
--- 22,30 ----
  * Returns:
  *	0 if nonexistent
  *	1 if exists and empty
! *	2 if exists and contains _only_ dot files
! *	3 if exists and contains a mount point
! *	4 if exists and not empty
  *	-1 if trouble accessing directory (errno reflects the error)
  */
 int
*** pg_check_dir(const char *dir)
*** 32,37 ****
--- 34,40 ----
 	DIR		   *chkdir;
 	struct dirent *file;
 	bool		dot_found = false;
+ 	bool		mount_found = false;
 
 	chkdir = opendir(dir);
 	if (chkdir == NULL)
*** pg_check_dir(const char *dir)
*** 51,60 ****
 		{
 			dot_found = true;
 		}
 		else if (strcmp("lost+found", file->d_name) == 0)
 		{
! 			result = 3;			/* not empty, mount point */
! 			break;
 		}
 #endif
 		else
--- 54,63 ----
 		{
 			dot_found = true;
 		}
+ 		/* lost+found directory */
 		else if (strcmp("lost+found", file->d_name) == 0)
 		{
! 			mount_found = true;
 		}
 #endif
 		else
*** pg_check_dir(const char *dir)
*** 64,72 ****
 		}
 	}
 
! 	if (errno || closedir(chkdir))
 		result = -1;			/* some kind of I/O error? */
 
 	/* We report on dot-files if we _only_ find dot files */
 	if (result == 1 && dot_found)
 		result = 2;
--- 67,82 ----
 		}
 	}
 
! 	if (errno)
 		result = -1;			/* some kind of I/O error? */
 
+ 	if (closedir(chkdir))
+ 		result = -1;			/* error executing closedir */
+ 
+ 	/* We report on mount point if we find a lost+found directory */
+ 	if (result == 1 && mount_found)
+ 		result = 3;
+ 
 	/* We report on dot-files if we _only_ find dot files */
 	if (result == 1 && dot_found)
 		result = 2;
-- 
2.2.2

signature.asc
Description: OpenPGP digital signature
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
On 01/30/2015 10:01 PM, Amit Kapila wrote:
On Fri, Jan 30, 2015 at 10:58 PM, Robert Haas robertmh...@gmail.com wrote:

Yes. The contents of postgresql.conf are only mildly order-dependent. If you put the same setting in more than once, it matters which one is last. Apart from that, though, it doesn't really matter: wal_keep_segments=10 means the same thing if it occurs before max_connections=401 that it means after that. The same is not true of pg_hba.conf, where the order matters a lot.

Do you mean to say that, as the authentication system uses just the first record that matches to perform authentication, it could lead to problems if the order is not maintained? Won't the same set of problems occur if a user does that manually without proper care for such rules? Now the problem with a command is that the user can't see the order in which entries are being made, but it seems to me that we can provide a view or some way for the user so that the order of entries is visible and the same is allowed to be manipulated via a command.

We *can*, yes. But the technical issues around that have not been addressed. Certainly just making the new system view respond to UPDATE/INSERT/DELETE would not be sufficient. And then once we address the technical issues, we'll need to address the security implications.

I think this is worth doing; there's some tremendous utility potential in having a PostgreSQL which can be 100% managed via port 5432, especially for the emerging world of container-based hosting (Docker et al.). However, it's also going to be difficult.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TABLESAMPLE patch
On 31/01/15 14:27, Amit Kapila wrote:
On Fri, Jan 23, 2015 at 5:39 AM, Petr Jelinek p...@2ndquadrant.com wrote:
On 19/01/15 07:08, Amit Kapila wrote:
On Sun, Jan 18, 2015 at 12:46 AM, Petr Jelinek p...@2ndquadrant.com wrote:

I think that's actually good to have, because we still do costing and the partial index might help produce better estimate of number of returned rows for tablesample as well.

I don't understand how it will help, because for tablesample scan it doesn't consider partial index at all as per patch.

Oh, I think we were talking about different things then; I thought you were talking about the index checks that the planner/optimizer sometimes does to get more accurate statistics. I'll take another look then.

Well, similar, not the same, as we are not always fetching the whole page or doing visibility checks on all tuples in the page, etc. But I don't see why it can't be inside nodeSamplescan. If you look at bitmap heap scan, that one is also essentially a somewhat modified sequential scan and does everything within the node nodeBitmapHeapscan, because the algorithm is not used anywhere else, just like sample scan.

I don't mind doing everything in nodeSamplescan, however if you can split the function, it will be easier to read and understand; if you see nodeBitmapHeapscan, that also has a function like bitgetpage(). This is just a suggestion and if you think that it can be split, then it's okay, otherwise leave it as it is.

Yeah, I can split it into a separate function within nodeSamplescan, that sounds reasonable.

Refer parameter (HeapScanDesc->rs_syncscan) and syncscan.c. Basically during sequential scan on the same table by different backends, we attempt to keep them synchronized to reduce the I/O.

Ah this, yes, it makes sense for bernoulli, not for system though. I guess it should be used for sampling methods that use the SAS_SEQUENTIAL strategy.
Have you taken care of this in your latest patch?

Not yet. I think I will need to make the strategy a property of the sampling method instead of returning it from the costing function, so that the info can be used by the scan.

Oh and BTW when I delete 50k of tuples (make them invisible) the results of 20 runs are between 30880 and 40063 rows. This is between 60% to 80%, lower than what is expected, but I guess we can't do much for this except for trying with reverse order for visibility test and sample tuple call, you can decide if you want to try that once just to see if that is better.

No, that's because I can't write properly; the lower number was supposed to be 39880, which is well within the tolerance, sorry for the confusion (9 and 0 are just too close).

Anyway, attached is new version with some updates that you mentioned (all_visible, etc). I also added optional interface for the sampling method to access the tuple contents as I can imagine sampling methods that will want to do that.

+	/*
+	 * Let the sampling method examine the actual tuple and decide if we
+	 * should return it.
+	 *
+	 * Note that we let it examine even invisible tuples.
+	 */
+	if (OidIsValid(node->tsmreturntuple.fn_oid))
+	{
+		found = DatumGetBool(FunctionCall4(&node->tsmreturntuple,
+										   PointerGetDatum(node),
+										   UInt32GetDatum(blockno),
+										   PointerGetDatum(tuple),
+										   BoolGetDatum(visible)));
+		/* XXX: better error */
+		if (found && !visible)
+			elog(ERROR, "Sampling method wanted to return invisible tuple");
+	}

You have mentioned in the comment that you let it examine invisible tuples, but it is not clear why you want to allow examining an invisible tuple and then later return an error; why can't it just skip invisible tuples?

Well, I think returning invisible tuples to the user is a bad idea and that's why the check, but I also think it might make sense for the sampling function to examine the invisible tuple in case it wants to create some kind of stats about the scan and wants to use those for making the decision about returning other tuples.
The interface should probably be called tsmexaminetuple instead, to make it more clear what the intention is.

1. How about statistics (pgstat_count_heap_getnext()) during SampleNext as we do in heap_getnext?

Right, will add.

2.
+static TupleTableSlot *
+SampleNext(SampleScanState *node)
+{
..
+	/*
+	 * Lock the buffer so we can safely assess tuple
+	 * visibility later.
+	 */
+	LockBuffer(buffer, BUFFER_LOCK_SHARE);
..
}

When is this content lock released? Shouldn't we release it after checking the visibility of the tuple?

Here,

+	if (!OffsetNumberIsValid(tupoffset))
+	{
+		UnlockReleaseBuffer(buffer);

but yes, you are correct, it should be just released there and we can unlock already after the visibility check.

3.
+static TupleTableSlot *
+SampleNext(SampleScanState
Re: [HACKERS] Re: [COMMITTERS] pgsql: Another attempt at fixing Windows Norwegian locale.
Heikki Linnakangas hlinnakan...@vmware.com writes: On 01/16/2015 07:05 PM, Heikki Linnakangas wrote: On 01/16/2015 04:17 PM, Tom Lane wrote: What instructions do you have in mind to give? Ok, I have created a wiki page for these instructions: http://wiki.postgresql.org/wiki/Changes_To_Norwegian_Locale They can be moved to the release notes, or we can just add a note there with a link to the wiki page. I think the latter would be better. Suggested reference in the release notes: Migration to Version X If you are a Windows user, using the Norwegian (Bokmål) locale, manual action is needed after the upgrade, to replace the Norwegian (Bokmål)_Norway locale names stored in system catalogs with its pure-ASCII alias, Norwegian_Norway. More information is available at http://wiki.postgresql.org/wiki/Changes_To_Norwegian_Locale I've looked at this issue a bit now. I'm good with using essentially this text in the release notes, but I think the instructions are one brick shy of a load. Specifically, you claimed in the second commit that we'd not made any releases using norwegian-bokmal, but that's utterly wrong: 9.4.0 uses that spelling. What advice do we need to give 9.4 users? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb, unicode escapes and escaped backslashes
Robert Haas robertmh...@gmail.com writes:

I understand Andrew to be saying that if you take a 6-character string and convert it to a JSON string and then back to text, you will *usually* get back the same 6 characters you started with ... unless the first character was \, the second u, and the remainder hexadecimal digits. Then you'll get back a one-character string or an error instead. It's not hard to imagine that leading to surprising behavior, or even security vulnerabilities in applications that aren't expecting such a translation to happen under them.

That *was* the case, with the now-reverted patch that changed the escaping rules. It's not anymore:

regression=# select to_json('\u1234'::text);
  to_json
-----------
 "\\u1234"
(1 row)

When you convert that back to text, you'll get \u1234, no more and no less. For example:

regression=# select array_to_json(array['\u1234'::text]);
 array_to_json
---------------
 ["\\u1234"]
(1 row)

regression=# select array_to_json(array['\u1234'::text])->0;
 ?column?
-----------
 "\\u1234"
(1 row)

regression=# select array_to_json(array['\u1234'::text])->>0;
 ?column?
----------
 \u1234
(1 row)

Now, if you put in '\u1234'::jsonb and extract that string as text, you get some Unicode character or other. But I'd say that a JSON user who is surprised by that doesn't understand JSON, and definitely that they hadn't read more than about one paragraph of our description of the JSON types.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes up for review
I wrote: First draft of release notes for the upcoming minor releases is committed at http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=77e9125e847adf76e9466814781957c0f32d8554 It should be visible on the documentation website after guaibasaurus does its next buildfarm run, a couple hours from now. BTW, that's up now at http://www.postgresql.org/docs/devel/static/release-9-4-1.html regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] File based Incremental backup v9
On 31/01/15 17:22, Erik Rijkers wrote:
On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
> 0001-public-parse_filename_for_nontemp_relation.patch
> 0002-copydir-LSN-v2.patch
> 0003-File-based-incremental-backup-v8.patch

Hi,

It looks like it only compiles with assert enabled.

It is due to a typo (assert instead of Assert). You can find the updated patch attached to this message.

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it

From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini <marco.nenciar...@2ndquadrant.it>
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation
---
 src/backend/storage/file/reinit.c | 58 ---------------------------
 src/common/relpath.c              | 56 +++++++++++++++++++++++++
 src/include/common/relpath.h      |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*** static void ResetUnloggedRelationsInTabl
*** 28,35 ****
  					int op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
  					int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
- 					int *oidchars, ForkNumber *fork);
 
  typedef struct
  {
--- 28,33 ----
*** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 ****
  		fsync_fname((char *) dbspacedirname, true);
  	}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
- 									ForkNumber *fork)
- {
- 	int			pos;
- 
- 	/* Look for a non-empty string of digits (that isn't too long). */
- 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
- 		;
- 	if (pos == 0 || pos > OIDCHARS)
- 		return false;
- 	*oidchars = pos;
- 
- 	/* Check for a fork name. */
- 	if (name[pos] != '_')
- 		*fork = MAIN_FORKNUM;
- 	else
- 	{
- 		int			forkchar;
- 
- 		forkchar = forkname_chars(&name[pos + 1], fork);
- 		if (forkchar <= 0)
- 			return false;
- 		pos += forkchar + 1;
- 	}
- 
- 	/* Check for a segment number. */
- 	if (name[pos] == '.')
- 	{
- 		int			segchar;
- 
- 		for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
- 			;
- 		if (segchar <= 1)
- 			return false;
- 		pos += segchar;
- 	}
- 
- 	/* Now we should be at the end. */
- 	if (name[pos] != '\0')
- 		return false;
- 	return true;
- }
--- 386,388 ----

diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 ****
--- 206,264 ----
 	}
 	return path;
 }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is entitled to assume that
+  * *oidchars has been set to the a value no more than OIDCHARS, and thus
+  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
+  * portion of the filename.  This is critical to protect against a possible
+  * buffer overrun.
+  */
+ bool
+ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+ 									ForkNumber *fork)
+ {
+ 	int			pos;
+ 
+ 	/* Look for a non-empty string of digits (that isn't too long). */
+ 	for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
+ 		;
+ 	if (pos == 0 || pos
Re: [HACKERS] POLA violation with \c service=
Hi all

I am sending a review of this patch:

* What does it do? - Allow connecting to another db from \connect using the URI connection format

postgres=# \c postgresql://localhost?service=old
psql (9.5devel, server 9.2.9)
You are now connected to database "postgres" as user "pavel".

* Do we want this feature? - yes, it eliminates an inconsistency between command-line connections and \connect. It is a good idea, without any objections.
* The patch applies cleanly, and then compiles without any issues
* All regression tests passed
* The psql documentation is updated -- this feature (and format) is not widely known, so maybe some more examples are welcome
* When I tested this feature, it worked as expected
* The code respects PostgreSQL coding rules.

I would prefer a slightly different test of whether to keep the password; the current code is a little harder to understand. But I can live with David's code too.

    if (!user)
        user = PQuser(o_conn);
    if (!host)
        host = PQhost(o_conn);
    if (!port)
        port = PQport(o_conn);

    if (dbname)
        has_connection_string = recognized_connection_string(dbname);

    /* we should not keep the password if some connection property is changed */
    keep_password =
        strcmp(user, PQuser(o_conn)) == 0 &&
        strcmp(host, PQhost(o_conn)) == 0 &&
        strcmp(port, PQport(o_conn)) == 0 &&
        !has_connection_string;

I have no other comments. Possible questions:

1. more examples in the docs
2. a small change to how keep_password is checked

Regards

Pavel

2015-01-13 15:00 GMT+01:00 David Fetter da...@fetter.org:
> On Sat, Jan 10, 2015 at 04:41:16PM -0800, David Fetter wrote:
>> On Sat, Jan 10, 2015 at 09:30:57AM +0100, Erik Rijkers wrote:
>>> On Fri, January 9, 2015 20:15, David Fetter wrote:
>>>> [psql_fix_uri_service_003.patch]
>>> Applies on master; the feature (switching services) works well but a
>>> \c without any parameters produces a segfault: (centos 6.6, 4.9.2,
>>> 64-bit)
>>>
>>> $ echo -en "$PGSERVICEFILE\n$PGSERVICE\n$PGPORT\n"
>>> /home/aardvark/.pg_service
>>> service_pola
>>> 6968
>>> $ psql
>>> Timing is on.
>>> psql (9.5devel_service_pola_20150109_2340_ac7009abd228)
>>> Type "help" for help.
>>> testdb=# \c service=HEAD
>>> You are now connected to database "testdb" as user "aardvark" via socket in "/tmp" at port "6545".
>>> testdb=# \c service=service_pola
>>> You are now connected to database "testdb" as user "aardvark" via socket in "/tmp" at port "6968".
>>> testdb=# \c
>>> Segmentation fault (core dumped)
>
> Fixed by running that function only if the argument exists. More C
> cleanups, too. Added to the upcoming commitfest.
>
> Cheers,
> David.
> --
> David Fetter da...@fetter.org http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fet...@gmail.com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Draft release notes up for review
First draft of release notes for the upcoming minor releases is committed at http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=77e9125e847adf76e9466814781957c0f32d8554 It should be visible on the documentation website after guaibasaurus does its next buildfarm run, a couple hours from now. Please let me know of any corrections ASAP. Note that I've been fairly ruthless about removing items altogether if they were unlikely to have clear user-visible impact, because the notes were eye-glazingly long already. Please mention it if you see any remaining items that we could dispense with ... or if you think I underestimated the significance of anything that got dropped. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Draft release notes up for review
Amit Langote amitlangot...@gmail.com writes: Perhaps the following two missed? http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=deadbf4f3324f7b2826cac60dd212dfa1b0084ec http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cd63c57e5cbfc16239aa6837f8b7043a721cdd28 Hm? Those are both included ... right next to each other in fact, starting at about line 275 in release-9.4.sgml as it currently stands. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Small doc patch about pg_service.conf
On 1/3/15 7:56 PM, Tom Lane wrote:
> Noah Misch n...@leadboat.com writes:
>> On Sat, Jan 03, 2015 at 02:36:50PM -0500, Noah Misch wrote:
>>> The directory libpq consults is `pg_config --sysconfdir`
>> I was wrong there. `pg_config --sysconfdir` uses get_etc_path(), which
>> adjusts to post-installation moves of the installation tree.
>> parseServiceInfo() uses the build-time SYSCONFDIR directly.
> Ugh. That makes things messy. Still, we're probably better off
> recommending `pg_config --sysconfdir` than anything else.

I doubt that the theoretical ability to do a post-installation move is used much, and it's certainly not used by people using a prepackaged build. So the recommendation would only be wrong for someone who had done such a move manually, and they'd probably know what to do anyway.

At least writing `pg_config --sysconfdir` indicates that it's in an installation-specific location, whereas hardcoding /etc will create confusion when it's not actually there. (Incidentally, we use /usr/local/pgsql/etc elsewhere in the documentation as a sample location.)

I propose the attached patch.

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index e5cab37..75a4d51 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -6957,7 +6957,7 @@ <title>The Connection Service File</title>
    at <filename>~/.pg_service.conf</filename> or the location specified by
    the environment variable <envar>PGSERVICEFILE</envar>, or it can be a
    system-wide file
-   at <filename>/etc/pg_service.conf</filename> or in the directory
+   at <filename>`pg_config --sysconfdir`/pg_service.conf</filename> or in the directory
    specified by the environment variable <envar>PGSYSCONFDIR</envar>.
    If service definitions with the same name exist in the user and the
    system file, the user file takes

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
On 2015-01-30 18:59:28 +0100, Heikki Linnakangas wrote:
> On 01/15/2015 03:03 AM, Andres Freund wrote:
>> 0004: Process 'die' interrupts while reading/writing from the client
>> socket. This is the reason Horiguchi-san started this thread.
>
> +	ProcessClientWriteInterrupt(!port->noblock);
> ...
> +/*
> + * ProcessClientWriteInterrupt() - Process interrupts specific to client writes
> + *
> + * This is called just after low-level writes. That might be after the read
> + * finished successfully, or it was interrupted via interrupt. 'blocked' tells
> + * us whether the
> + *
> + * Must preserve errno!
> + */
> +void
> +ProcessClientWriteInterrupt(bool blocked)
>
> You're passing port->noblock as argument, but I thought the argument is
> supposed to mean whether the write would've blocked, i.e. if the write
> buffer was full. Right? port->noblock doesn't mean that. But perhaps I
> misunderstood this - the comment on the 'blocked' argument above is a bit
> incomplete ;-).

The point here is that we tried the send() and then were interrupted while waiting for the buffer to become writable. That pretty much implies that the write buffer is full.

The !port->noblock is because we only are blocked on write if we're doing a blocking send, otherwise we'll just return control to the caller...

I agree that this could use some more comments ;)

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tablespaces inside $PGDATA considered harmful
> it is just as likely they simply are not aware of the downsides and the
> only reason they put it in $PGDATA is that it seemed like a logical place
> to put a directory that is intended to hold database data.

Yes, this is the reason why we got into this issue. The name PGDATA is misleading.

> The creators of tablespaces seem to have envisioned their usage as a means
> of pulling in disparate file systems and not simply for namespaces within
> the main filesystem that $PGDATA exists on.

True too. We have a lot of tablespaces. I probably wouldn't go that way now, but it still has the advantage of helping quickly move parts of the data around to manage filesystem usage.

> Given all this, it seems like a good idea to at least give a warning if
> somebody tries to create a tablespace inside the data directory.

IMHO the first place to put a warning is within the documentation:
http://www.postgresql.org/docs/9.4/interactive/manage-ag-tablespaces.html
and possibly a crosslink in
http://www.postgresql.org/docs/9.4/interactive/sql-createtablespace.html

> If this is intended to be back-patched then I'd go with just a warning.
> If this is strictly 9.5 material then I'd say that since our own tools
> behave badly in the current situation we should simply outright disallow
> it.

We have a lot of maintenance scripts that rely on our architecture ($PGDATA -> symlinks -> tablespace locations). We already made a quick evaluation of how to fix this, but gave it up for now due to the amount of work. So please be cautious about disallowing it too abruptly. Back-patching a change that disallows our current architecture could prevent us from applying minor releases for a while...

regards,

Marc Mamin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Safe memory allocation functions
On Sat, Jan 31, 2015 at 2:58 AM, Robert Haas robertmh...@gmail.com wrote:
> Committed. I didn't think we really need to expose two separate flags for
> the aligned and unaligned cases, so I ripped that out. I also removed the
> duplicate documentation of the new constants in the function header;
> having two copies of the documentation, one far removed from the constants
> themselves, is a recipe for them eventually getting out of sync. I also
> moved the function to what I thought was a more logical place in the file,
> and rearranged the order of tests slightly so that, in the common path, we
> test only whether ret == NULL and not anything else.

Thanks a lot!

I have reworked the module used for testing the new extended routine a bit (with new tests for Alloc, AllocZero and AllocHuge as well) and pushed it here:
https://github.com/michaelpq/pg_plugins/tree/master/mcxtalloc_test

This is just to mention it for the sake of the archives; I don't really believe it should be an in-core module in src/test/modules, as it does allocations of 1GB in environments where there is enough memory. Note that it contains regression tests with alternate outputs depending on the environment where they are run (still, output may not be stable depending on where those tests are run, particularly where memory usage varies a lot).
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
On 2015-01-31 00:52:03 +0100, Heikki Linnakangas wrote:
> On 01/15/2015 03:03 AM, Andres Freund wrote:
>> 0004: Process 'die' interrupts while reading/writing from the client
>> socket.
>
> +	 * Check for interrupts here, in addition to secure_write(),
> +	 * because a interrupted write in secure_raw_write() can possibly
> +	 * only return to here, not to secure_write().
> 	 */
> +	ProcessClientWriteInterrupt(true);
>
> Took me a while to parse that sentence. How about:
>
>     Check for interrupts here, in addition to secure_write(), because an
>     interrupted write in secure_raw_write() will return here, and we
>     cannot return to secure_write() until we've written something.

Yep, that's better.

>> 0005: Move deadlock and other interrupt handling in proc.c out of signal
>> handlers.
>>
>> I'm surprised how comparatively simple this turned out to be. For
>> robustness and understanding I think this is a great improvement.
>>
>> New and not reviewed at all. Needs significant review. But works in my
>> simple testcases.
>
> @@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
>  	proc = MyProc;
>  	MyProc = NULL;
>  	DisownLatch(&proc->procLatch);
> +	SetLatch(MyLatch);
>
>  	SpinLockAcquire(ProcStructLock);
>
> @@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
>  	proc = MyProc;
>  	MyProc = NULL;
>  	DisownLatch(&proc->procLatch);
> +	SetLatch(MyLatch);
>
>  	SpinLockAcquire(ProcStructLock);
>
> These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already
> sets the latch. (According to the comment in SwitchToSharedLatch() even
> that should not be necessary.)

Ick, that's some debugging leftovers where I was trying to find a bug that was fundamentally caused by the missing barriers in the latch code...

>> 0006: Don't allow immediate interupts during authentication anymore.
>
> In commit message: s/interupts/interrupts/.
>
> @@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
>  	if (*shadow_pass == '\0')
>  		return STATUS_ERROR;	/* empty password */
>
> -	/* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
> -	ImmediateInterruptOK = true;
> -	/* And don't forget to detect one that already arrived */
>  	CHECK_FOR_INTERRUPTS();
>
> That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly
> placed now that the ImmediateInterruptOK = true is gone. I would just
> remove it. Add one to the caller if it's needed, but it probably isn't.

There's some method to the madness here actually - we just did some database stuff that could in theory take a while... Whether it's worthwhile to leave it here I'm not sure.

>> 2) Does anybody see a significant problem with the reduction of cases
>> where we immediately react to authentication_timeout? Since we still
>> react immediately when we're waiting for the client (except maybe in
>> sspi/kerberos?) I think that's ok. The waiting cases are slow
>> ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
>> out of the relevant code anyway.
>
> Yeah, I think that's OK. Doing those things with ImmediateInterruptOK=true
> was quite scary, and we need to stop doing that. It would be nice to have
> a wholesale solution for those lookups but I don't see one. So all the
> ident/radius/kerberos/ldap lookups will need to be done in a non-blocking
> way to have them react to the timeout. That requires some work, but we
> can leave that to a followup patch, if someone wants to write it.

Ok, cool.

> Overall, 1-8 look good to me, except for that one incomplete comment in
> ProcessClientWriteInterrupt() I mentioned in a separate email.

Thanks for the review! I plan to push the fixed up versions sometime next week.

> I haven't reviewed patches 9 and 10 yet.

Yea, those aren't as close to being ready (and thus marked WIP). I'd like to do the lwlock stuff for 9.5 because then any sane setup (i.e. unless spinlocks are emulated) doesn't need any semaphores anymore, which makes setup easier. Right now users need to adjust system settings when changing max_connections upwards or run multiple clusters. But it's not really related to getting rid of ImmediateInterruptOK.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Streaming replication and WAL archive interactions
Hi,

On 2014-12-19 22:56:40 +0200, Heikki Linnakangas wrote:
> This adds two new archive_modes, 'shared' and 'always', to indicate
> whether the WAL archive is shared between the primary and standby, or
> not. In shared mode, the standby tracks which files have been archived
> by the primary. The standby refrains from recycling files that the
> primary has not yet archived, and at failover, the standby archives all
> those files too from the old timeline. In 'always' mode, the standby's
> WAL archive is taken to be separate from the primary's, and the standby
> independently archives all files it receives from the primary.

I don't really like this approach. Sharing an archive is rather dangerous in my experience - if your old master comes up again (and writes into the last WAL file) or similar, you can get into really bad situations.

What I was thinking about was instead trying to detect the point up to which files were safely archived by running the restore_command to check for the presence of archived files. Then archive anything that has valid content and isn't yet archived. That doesn't sound particularly complicated to me.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers