Re: [HACKERS] File based Incremental backup v8

2015-01-31 Thread Erik Rijkers
On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:

 0001-public-parse_filename_for_nontemp_relation.patch
 0002-copydir-LSN-v2.patch
 0003-File-based-incremental-backup-v8.patch

Hi,

It looks like it only compiles with assert enabled.

This is perhaps not yet really a problem at this stage but I thought I'd 
mention it:

make --quiet -j 8
In file included from gram.y:14403:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:10174:23: warning: unused variable ‘yyg’ [-Wunused-variable]
 struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var may be 
unused depending upon options. */
   ^
basebackup.c: In function ‘writeBackupProfileLine’:
basebackup.c:1545:8: warning: format ‘%lld’ expects argument of type ‘long long 
int’, but argument 8 has type ‘__off_t’
[-Wformat=]
filename);
^
basebackup.c:1545:8: warning: format ‘%lld’ expects argument of type ‘long long 
int’, but argument 8 has type ‘__off_t’
[-Wformat=]
pg_basebackup.c: In function ‘ReceiveTarFile’:
pg_basebackup.c:858:2: warning: implicit declaration of function ‘assert’ 
[-Wimplicit-function-declaration]
  assert(res || (strcmp(basedir, "-") == 0));
  ^
pg_basebackup.c:865:2: warning: ISO C90 forbids mixed declarations and code 
[-Wdeclaration-after-statement]
  gzFile  ztarfile = NULL;
  ^
pg_basebackup.o: In function `ReceiveAndUnpackTarFile':
pg_basebackup.c:(.text+0x690): undefined reference to `assert'
pg_basebackup.o: In function `ReceiveTarFile':
pg_basebackup.c:(.text+0xeb0): undefined reference to `assert'
pg_basebackup.c:(.text+0x10ad): undefined reference to `assert'
collect2: error: ld returned 1 exit status
make[3]: *** [pg_basebackup] Error 1
make[3]: *** Waiting for unfinished jobs
make[2]: *** [all-pg_basebackup-recurse] Error 2
make[2]: *** Waiting for unfinished jobs
make[1]: *** [all-bin-recurse] Error 2
make: *** [all-src-recurse] Error 2
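
For reference, the usual portable fix for the %lld-versus-__off_t warnings
above is an explicit cast to long long; a minimal sketch (illustrative
names, not the patch's actual code):

    /* statbuf.st_size is off_t, whose width varies by platform; cast it
     * so it always matches the %lld conversion */
    fprintf(fp, "%s\t%lld\n", filename, (long long) statbuf.st_size);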

The configure used was:
./configure \
 --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
 
--bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin.fast
 \
 
--libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib.fast
 \
 --with-pgport=6973 --quiet --enable-depend \
 --with-extra-version=_incremental_backup_20150131_1521_08bd0c581158 \
 --with-openssl --with-perl --with-libxml --with-libxslt --with-zlib


A build with --enable-cassert and --enable-debug builds fine:

./configure \
 --prefix=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup \
 --bindir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/bin 
\
 --libdir=/home/aardvark/pg_stuff/pg_installations/pgsql.incremental_backup/lib 
\
 --with-pgport=6973 --quiet --enable-depend \
 --with-extra-version=_incremental_backup_20150131_1628_08bd0c581158 \
 --enable-cassert --enable-debug \
 --with-openssl --with-perl --with-libxml --with-libxslt --with-zlib

I will further test with that.


thanks,

Erik Rijkers




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: [GENERAL] 4B row limit for CLOB tables

2015-01-31 Thread Roger Pack
Oops forgot to forward to the list (suggestion/feature request to the
list admin for the various pg lists: make the default reply to go to
the list, not the sender, if at all possible).

Response below:

On 1/30/15, Jim Nasby jim.na...@bluetreble.com wrote:
 On 1/30/15 11:54 AM, Roger Pack wrote:
 On 1/29/15, Roger Pack rogerdpa...@gmail.com wrote:
 Hello.  I see on this page a mention of basically a 4B row limit for
 tables that have BLOB's

 Oops I meant for BYTEA or TEXT columns, but it's possible the
 reasoning is the same...

 It only applies to large objects, not bytea or text.

 OK I think I figured out possibly why the wiki says this.  I guess
 BYTEA entries > 2KB will be autostored via TOAST, which uses an OID in
 its backend.  So BYTEA has the same limitation.  It appears that
 disabling TOAST is not an option [1].
 So I guess if the number of BYTEA entries (in the sum of all tables?
 partitioning doesn't help?) with size > 2KB is > 4 billion then there
 is actually no option there?  If this occurred it might cause all
 sorts of things to break? [2]

 It's a bit more complex than that. First, toast isn't limited to bytea;
 it holds for ALL varlena fields in a table that are allowed to store
 externally. Second, the limit is actually per-table: every table gets
 its own toast table, and each toast table is limited to 4B unique OIDs.
 Third, the OID counter is actually global, but the code should handle
 conflicts by trying to get another OID. See toast_save_datum(), which
 calls GetNewOidWithIndex().

 Now, the reality is that GetNewOidWithIndex() is going to keep
 incrementing the global OID counter until it finds an OID that isn't in
 the toast table. That means that if you actually get anywhere close to
 using 4B OIDs you're going to become extremely unhappy with the
 performance of toasting new data.

OK, so system stability doesn't degrade per se when it wraps (since
they all use that GetNewOid method or similar?), good to know.

So basically when it gets near 4B TOAST'ed rows it may have to wrap the
counter and search for an unused number, and for each candidate number it
queries the TOAST table to see if it's already used, degrading
performance.
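
A minimal sketch of that retry pattern, modeled on GetNewOidWithIndex()
(toast_oid_in_use() is a hypothetical stand-in for the index probe the
real code performs):

    #include "postgres.h"
    #include "access/transam.h"
    #include "utils/rel.h"

    static Oid
    pick_toast_oid(Relation toastrel, Relation toastidx)
    {
        Oid         oid;

        do
            oid = GetNewObjectId();     /* advances the global OID counter */
        while (toast_oid_in_use(toastrel, toastidx, oid));  /* hypothetical */

        return oid;
    }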

So I guess partitioning tables for now is an acceptable workaround,
good to know.

Thanks much for your response, good to know the details before we dive
into postgres with our 8B row table with BYTEA's in it :)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] File based Incremental backup v8

2015-01-31 Thread Marco Nenciarini
On 29/01/15 18:57, Robert Haas wrote:
 On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 The current implementation of copydir function is incompatible with LSN
 based incremental backups. The problem is that new files are created,
 but their blocks are still with the old LSN, so they will not be backed
 up because they look old enough.
 
 I think this is trying to pollute what's supposed to be a pure
 fs-level operation (copy a directory) into something that is aware
 of specific details like the PostgreSQL page format.  I really think
 that nothing in storage/file should know about the page format.  If we
 need a function that copies a file while replacing the LSNs, I think
 it should be a new function living somewhere else.

Given that the copydir function is used only during CREATE DATABASE and
ALTER DATABASE SET TABLESPACE, we could move/rename it to a better
place that clearly marks it as knowing about the page format. I'm open to
suggestions on where to place it and on what the correct name should be.

However, the whole copydir patch here should be treated as a temporary
thing. It is necessary until proper WAL logging of CREATE DATABASE and
ALTER DATABASE SET TABLESPACE is implemented to support any form of
LSN-based incremental backup.

 
 A bigger problem is that you are proposing to stamp those files with
 LSNs that are, for lack of a better word, fake.  I would expect that
 this would completely break if checksums are enabled.

I'm sorry, I completely ignored checksums in the previous patch. The attached
one works with checksums enabled.
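
For context, a minimal sketch of the page-level operation being discussed
(stamping an LSN while keeping the checksum valid), using the standard
bufpage/checksum APIs; this is illustrative, not the patch's actual code:

    #include "access/xlog.h"
    #include "storage/bufpage.h"
    #include "storage/checksum.h"

    static void
    stamp_page_lsn(Page page, BlockNumber blkno, XLogRecPtr lsn)
    {
        PageSetLSN(page, lsn);
        /* the checksum covers pd_lsn, so it must be recomputed */
        if (DataChecksumsEnabled())
            ((PageHeader) page)->pd_checksum =
                pg_checksum_page((char *) page, blkno);
    }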

 Also, unlogged relations typically have an LSN of 0; this would
 change that in some cases, and I don't know whether that's OK.


It shouldn't be a problem, because all the code that uses unlogged
relations normally skips the WAL-related operations. From the point
of view of an incremental backup it is also not a problem, because
when restoring the backup the unlogged tables will get reinitialized by
the crash recovery procedure. However, if you think it is worth the
effort, I can rewrite copydir as a two-pass operation, detecting the
unlogged tables on the first pass and avoiding the LSN update on them
in the second. I personally think that it isn't worth the effort
unless someone identifies a real case where setting LSNs on unlogged
relations leads to an issue.

 The issues here are similar to those in
 http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de
 - basically, I think we need to make CREATE DATABASE and ALTER
 DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
 never going to work right.  If we're not going to allow that, we need
 to disallow hot backups while those operations are in progress.
 

This is right, but the problem Andres reported is orthogonal to the
one I'm addressing here. Without this copydir patch (or without proper
WAL logging of copydir operations), you cannot take an incremental
backup after a CREATE DATABASE or ALTER DATABASE SET TABLESPACE until
you take a full backup and use it as a base.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini marco.nenciar...@2ndquadrant.it
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation

---
 src/backend/storage/file/reinit.c | 58 ---
 src/common/relpath.c  | 56 +
 src/include/common/relpath.h  |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c 
b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*** static void ResetUnloggedRelationsInTabl
*** 28,35 
  int 
op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
-   int 
*oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 
*** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 
fsync_fname((char *) dbspacedirname, true);
}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer 

Re: [HACKERS] TABLESAMPLE patch

2015-01-31 Thread Amit Kapila
On Fri, Jan 23, 2015 at 5:39 AM, Petr Jelinek p...@2ndquadrant.com wrote:

 On 19/01/15 07:08, Amit Kapila wrote:

 On Sun, Jan 18, 2015 at 12:46 AM, Petr Jelinek p...@2ndquadrant.com wrote:


 I think that's actually good to have, because we still do costing and the
 partial index might help produce a better estimate of the number of returned
 rows for tablesample as well.


I don't understand how it will help, because a tablesample scan
doesn't consider a partial index at all, as per the patch.

CREATE TABLE test_tablesample (id INT, name text) WITH (fillfactor=10);
INSERT INTO test_tablesample SELECT i, repeat(i::text, 200) FROM
generate_series(0, 9) s(i) ORDER BY i;
INSERT INTO test_tablesample values(generate_series(10,1);

create index idx_tblsample on test_tablesample(id) where id;

Analyze test_tablesample;

postgres=# explain SELECT id FROM test_tablesample where id  ;
QUERY PLAN


---
 Index Only Scan using idx_tblsample on test_tablesample  (cost=0.13..8.14
rows=
1 width=4)
   Index Cond: (id  )
(2 rows)


postgres=# explain SELECT id FROM test_tablesample TABLESAMPLE BERNOULLI
(80) where id  ;
   QUERY PLAN

 Sample Scan on test_tablesample  (cost=0.00..658.00 rows=8000 width=4)
   Filter: (id  )
(2 rows)

The above result also clearly indicates that when the TABLESAMPLE
clause is used, it won't use a partial index.


 Well similar, not same as we are not always fetching whole page or doing
 visibility checks on all tuples in the page, etc. But I don't see why it
 can't be inside nodeSamplescan. If you look at bitmap heap scan, that one
 is also essentially somewhat modified sequential scan and does everything
 within the node nodeBitmapHeapscan because the algorithm is not used
 anywhere else, just like sample scan.


I don't mind doing everything in nodeSamplescan; however, if
you can split the function, it will be easier to read and understand.
If you look at nodeBitmapHeapscan, that also has a function like
bitgetpage().
This is just a suggestion, and if you think that it can be split,
then it's okay; otherwise leave it as it is.


 Refer to the parameter (HeapScanDesc->rs_syncscan) and syncscan.c.
 Basically, during a sequential scan on the same table by different
 backends, we attempt to keep them synchronized to reduce the I/O.


 Ah this, yes, it makes sense for bernoulli, not for system though. I guess
 it should be used for sampling methods that use SAS_SEQUENTIAL strategy.


Have you taken care of this in your latest patch?



 I don't know how else to explain, if we loop over every tuple in the
 relation and return it with equal probability then visibility checks don't
 matter, as the percentage of visible tuples will be the same in the result
 as in the relation.

 For example if you have 30% visible tuples and you are interested in 10%
 of tuples overall it will return 10% of all tuples and since every tuple
 has 10% chance of being returned, 30% of those should be visible (if we
 assume smooth distribution of random numbers generated). So in the end you
 are getting 10% of visible tuples because the scan will obviously skip the
 invisible ones and that's what you wanted.

 As I said problem of analyze is that it uses tuple limit instead of
 probability.


Okay, got the point.



 Yes, the difference is smaller (in relative numbers) for bigger tables
 when I test this. On a 1k row table the spread of 20 runs was between 770-818
 and on a 100k row table it's between 79868-80154. I think that's acceptable
 variance and it's imho indeed the random generator that produces this.


Sure, I think this is acceptable.


 Oh and BTW when I delete 50k of tuples (make them invisible) the results
 of 20 runs are between 30880 and 40063 rows.


This is between 60% and 80%, lower than expected,
but I guess we can't do much for this except trying the
reverse order for the visibility test and the sample tuple call;
you can decide if you want to try that once just to see if it
is better.


 I am sure it could be done with sequence scan. Not sure if it would be
 pretty and more importantly, the TABLESAMPLE is *not* sequential scan. The
 fact that bernoulli method looks similar should not make us assume that
 every other method does as well, especially when system method is
 completely different.


Okay, lets keep it as separate.



 Anyway, attached is new version with some updates that you mentioned
 (all_visible, etc).
 I also added an optional interface for the sampling method to access the
 tuple contents, as I can imagine sampling methods that will want to do that.


+ /*
+ * Let the sampling method examine the actual tuple and decide if we
+ * should return it.
+ *
+ * Note that we let it examine even invisible tuples.
+ */
+ if 

Re: [HACKERS] pg_check_dir comments and implementation mismatch

2015-01-31 Thread Marco Nenciarini
On 30/01/15 03:54, Michael Paquier wrote:
 On Fri, Jan 30, 2015 at 2:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 There is at least one other bug in that function now that I look at it:
 in event of a readdir() failure, it neglects to execute closedir().
 Perhaps not too significant since all existing callers will just exit()
 anyway after a failure, but still ...
 I would imagine that code scanners like coverity or similar would not
 be happy about that. ISTM that it is better to closedir()
 appropriately in all the code paths.
 

I've attached a new version of the patch fixing the missing closedir on
readdir error.
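
For reference, a sketch of how a caller might act on the five return values
documented in the updated comment (loosely modeled on initdb-style checks;
the comments are illustrative):

    switch (pg_check_dir(datadir))
    {
        case 0:     /* directory does not exist: create it */
            break;
        case 1:     /* exists and empty: safe to use */
            break;
        case 2:     /* only dot files: likely safe, maybe warn */
        case 3:     /* lost+found present: looks like a mount point, warn */
            break;
        case 4:     /* exists and not empty: refuse to use it */
            break;
        default:    /* -1: I/O error, errno has the details */
            break;
    }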

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From d2bfb6878aed404fdba1447d3f813edb4442ff47 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini marco.nenciar...@2ndquadrant.it
Date: Sat, 31 Jan 2015 14:06:54 +0100
Subject: [PATCH] Improve pg_check_dir comments and code

Update the pg_check_dir comment to explain all the possible return
values (0 to 4).

Fix the behaviour in the presence of a lost+found directory. The previous
version was returning 3 (mount point) even if readdir returned
something after the lost+found directory. In this case the output should
be 4 (not empty).

Ensure that closedir is always called even if readdir returns an error.
---
 src/port/pgcheckdir.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/port/pgcheckdir.c b/src/port/pgcheckdir.c
index 7061893..02a048c 100644
*** a/src/port/pgcheckdir.c
--- b/src/port/pgcheckdir.c
***
*** 22,28 
   * Returns:
   *0 if nonexistent
   *1 if exists and empty
!  *2 if exists and not empty
   *-1 if trouble accessing directory (errno reflects the error)
   */
  int
--- 22,30 
   * Returns:
   *0 if nonexistent
   *1 if exists and empty
!  *2 if exists and contains _only_ dot files
!  *3 if exists and contains a mount point
!  *4 if exists and not empty
   *-1 if trouble accessing directory (errno reflects the error)
   */
  int
*** pg_check_dir(const char *dir)
*** 32,37 
--- 34,40 
DIR*chkdir;
struct dirent *file;
booldot_found = false;
+   boolmount_found = false;
  
chkdir = opendir(dir);
if (chkdir == NULL)
*** pg_check_dir(const char *dir)
*** 51,60 
{
dot_found = true;
}
		else if (strcmp("lost+found", file->d_name) == 0)
		{
!			result = 3; /* not empty, mount point */
!			break;
		}
  #endif
else
--- 54,63 
{
dot_found = true;
}
+   /* lost+found directory */
	else if (strcmp("lost+found", file->d_name) == 0)
{
!   mount_found = true;
}
  #endif
else
*** pg_check_dir(const char *dir)
*** 64,72 
}
}
  
!   if (errno || closedir(chkdir))
result = -1;/* some kind of I/O error? */
  
/* We report on dot-files if we _only_ find dot files */
	if (result == 1 && dot_found)
result = 2;
--- 67,82 
}
}
  
!   if (errno)
result = -1;/* some kind of I/O error? */
  
+   if (closedir(chkdir))
+   result = -1;/* error executing closedir */
+ 
+   /* We report on mount point if we find a lost+found directory */
+   if (result == 1 && mount_found)
+   result = 3;
+ 
/* We report on dot-files if we _only_ find dot files */
	if (result == 1 && dot_found)
result = 2;
-- 
2.2.2



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2015-01-31 Thread Josh Berkus
On 01/30/2015 10:01 PM, Amit Kapila wrote:
 On Fri, Jan 30, 2015 at 10:58 PM, Robert Haas robertmh...@gmail.com wrote:
 Yes.  The contents of postgresql.conf are only mildly order-dependent.
 If you put the same setting in more than once, it matters which one is
 last.  Apart from that, though, it doesn't really matter:
 wal_keep_segments=10 means the same thing if it occurs before
 max_connections=401 that it means after that.  The same is not true of
 pg_hba.conf, where the order matters a lot.  
 
 Do you mean to say that, as the authentication system uses just the
 first record that matches to perform authentication, it could lead
 to problems if the order is not maintained?  Won't the same
 set of problems occur if the user makes such changes manually and does
 so without proper care for such rules?  Now, the problem with a
 command is that the user can't see the order in which entries are
 being made, but it seems to me that we can provide a view or some
 way for the user so that the order of entries is visible and can
 also be manipulated via a command.

We *can*, yes.  But the technical issues around that have not been
addressed.  Certainly just making the new system view respond to
UPDATE/INSERT/DELETE would not be sufficient.

And then once we address the technical issues, we'll need to address the
security implications.

I think this is worth doing; there's some tremendous utility potential
in having a PostgreSQL which can be 100% managed via port 5432,
especially for the emerging world of container-based hosting (Docker
et al.).  However, it's also going to be difficult.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESAMPLE patch

2015-01-31 Thread Petr Jelinek

On 31/01/15 14:27, Amit Kapila wrote:

On Fri, Jan 23, 2015 at 5:39 AM, Petr Jelinek p...@2ndquadrant.com wrote:

On 19/01/15 07:08, Amit Kapila wrote:

On Sun, Jan 18, 2015 at 12:46 AM, Petr Jelinek p...@2ndquadrant.com wrote:


I think that's actually good to have, because we still do costing
and the partial index might help produce a better estimate of the number
of returned rows for tablesample as well.


I don't understand how it will help, because a tablesample scan
doesn't consider a partial index at all, as per the patch.



Oh, I think we were talking about different things then; I thought you 
were talking about the index checks that the planner/optimizer sometimes 
does to get more accurate statistics. I'll take another look then.




Well similar, not same as we are not always fetching whole page or
doing visibility checks on all tuples in the page, etc. But I don't
see why it can't be inside nodeSamplescan. If you look at bitmap
heap scan, that one is also essentially somewhat modified sequential
scan and does everything within the node nodeBitmapHeapscan because
the algorithm is not used anywhere else, just like sample scan.


I don't mind doing everything in nodeSamplescan; however, if
you can split the function, it will be easier to read and understand.
If you look at nodeBitmapHeapscan, that also has a function like
bitgetpage().
This is just a suggestion, and if you think that it can be split,
then it's okay; otherwise leave it as it is.


Yeah, I can split it into a separate function within nodeSamplescan; that 
sounds reasonable.





Refer to the parameter (HeapScanDesc->rs_syncscan) and syncscan.c.
Basically, during a sequential scan on the same table by different
backends, we attempt to keep them synchronized to reduce the I/O.


Ah this, yes, it makes sense for bernoulli, not for system though. I
guess it should be used for sampling methods that use SAS_SEQUENTIAL
strategy.


Have you taken care of this in your latest patch?


Not yet. I think I will need to make the strategy a property of the 
sampling method instead of returning it from the costing function, so that 
the info can be used by the scan.




Oh and BTW when I delete 50k of tuples (make them invisible) the
results of 20 runs are between 30880 and 40063 rows.


This is between 60% and 80%, lower than expected,
but I guess we can't do much for this except trying the
reverse order for the visibility test and the sample tuple call;
you can decide if you want to try that once just to see if it
is better.



No, that's because I can't write properly; the lower number was supposed 
to be 39880, which is well within the tolerance. Sorry for the confusion 
(9 and 0 are just too close).



Anyway, attached is new version with some updates that you mentioned
(all_visible, etc).
I also added an optional interface for the sampling method to access
the tuple contents, as I can imagine sampling methods that will want
to do that.

+/*
+* Let the sampling method examine the actual tuple and decide if we
+* should return it.
+*
+* Note that we let it examine even invisible tuples.
+*/
+if (OidIsValid(node->tsmreturntuple.fn_oid))
+{
+    found = DatumGetBool(FunctionCall4(&node->tsmreturntuple,
+                                       PointerGetDatum(node),
+                                       UInt32GetDatum(blockno),
+                                       PointerGetDatum(tuple),
+                                       BoolGetDatum(visible)));
+    /* XXX: better error */
+    if (found && !visible)
+        elog(ERROR, "Sampling method wanted to return invisible tuple");
+}

You have mentioned in the comment that we let it examine invisible tuples,
but it is not clear why you want to allow examining an invisible tuple
and then later return an error; why can't it skip invisible tuples?



Well, I think returning invisible tuples to the user is a bad idea, and 
that's why the check, but I also think it might make sense to let the 
sampling function examine the invisible tuple, in case it wants to collect 
some kind of stats about the scan and use those when deciding whether to 
return other tuples. The interface should probably be called 
tsmexaminetuple instead, to make the intention clearer.
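
A sketch of what such a tsmexaminetuple callback could look like, mirroring
the FunctionCall4() invocation quoted above (want_this_tuple() is a
hypothetical method-private decision, and the signature is an assumption
based on this thread, not a settled API):

    Datum
    tsm_example_examinetuple(PG_FUNCTION_ARGS)
    {
        SampleScanState *node = (SampleScanState *) PG_GETARG_POINTER(0);
        uint32      blockno = PG_GETARG_UINT32(1);
        HeapTuple   tuple = (HeapTuple) PG_GETARG_POINTER(2);
        bool        visible = PG_GETARG_BOOL(3);

        /* update method-private statistics here, even for invisible tuples */

        /* never ask the scan to return an invisible tuple */
        PG_RETURN_BOOL(visible && want_this_tuple(node, blockno, tuple));
    }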



1.
How about statistics (pgstat_count_heap_getnext()) during
SampleNext as we do in heap_getnext?


Right, will add.



2.
+static TupleTableSlot *
+SampleNext(SampleScanState *node)
+{
..
+/*
+* Lock the buffer so we can safely assess tuple
+* visibility later.
+*/
+LockBuffer(buffer, BUFFER_LOCK_SHARE);
..
}

When is this content lock released, shouldn't we release it after
checking visibility of tuple?


Here,
+   if (!OffsetNumberIsValid(tupoffset))
+   {
+   UnlockReleaseBuffer(buffer);

but yes, you are correct: the lock should be released there, and we can 
unlock right after the visibility check.
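
In other words, something like this ordering (a fragment; snapshot and
tuple come from the surrounding scan state):

    LockBuffer(buffer, BUFFER_LOCK_SHARE);
    visible = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    /* tsmexaminetuple / return-tuple logic can run without the lock,
     * while the buffer stays pinned */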




3.
+static TupleTableSlot *
+SampleNext(SampleScanState 

Re: [HACKERS] Re: [COMMITTERS] pgsql: Another attempt at fixing Windows Norwegian locale.

2015-01-31 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 01/16/2015 07:05 PM, Heikki Linnakangas wrote:
 On 01/16/2015 04:17 PM, Tom Lane wrote:
 What instructions do you have in mind to give?

 Ok, I have created a wiki page for these instructions:

 http://wiki.postgresql.org/wiki/Changes_To_Norwegian_Locale

 They can be moved to the release notes, or we can just add a note there 
 with a link to the wiki page. I think the latter would be better. 
 Suggested reference in the release notes:

 Migration to Version X

 If you are a Windows user, using the Norwegian (Bokmål) locale, manual 
 action is needed after the upgrade, to replace the Norwegian 
 (Bokmål)_Norway locale names stored in system catalogs with its 
 pure-ASCII alias, Norwegian_Norway. More information is available at 
 http://wiki.postgresql.org/wiki/Changes_To_Norwegian_Locale

I've looked at this issue a bit now.  I'm good with using essentially this
text in the release notes, but I think the instructions are one brick shy
of a load.  Specifically, you claimed in the second commit that we'd not
made any releases using norwegian-bokmal, but that's utterly wrong:
9.4.0 uses that spelling.  What advice do we need to give 9.4 users?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-31 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I understand Andrew to be saying that if you take a 6-character string
 and convert it to a JSON string and then back to text, you will
 *usually* get back the same 6 characters you started with ... unless
 the first character was \, the second u, and the remainder hexadecimal
 digits.  Then you'll get back a one-character string or an error
 instead.  It's not hard to imagine that leading to surprising
 behavior, or even security vulnerabilities in applications that aren't
 expecting such a translation to happen under them.

That *was* the case, with the now-reverted patch that changed the escaping
rules.  It's not anymore:

regression=# select to_json('\u1234'::text);
  to_json  
-----------
 "\\u1234"
(1 row)

When you convert that back to text, you'll get \u1234, no more and no
less.  For example:

regression=# select array_to_json(array['\u1234'::text]);
 array_to_json 
---------------
 ["\\u1234"]
(1 row)

regression=# select array_to_json(array['\u1234'::text])->0;
 ?column?  
-----------
 "\\u1234"
(1 row)

regression=# select array_to_json(array['\u1234'::text])->>0;
 ?column? 
----------
 \u1234
(1 row)

Now, if you put in '\u1234'::jsonb and extract that string as text,
you get some Unicode character or other.  But I'd say that a JSON user
who is surprised by that doesn't understand JSON, and definitely that they
hadn't read more than about one paragraph of our description of the JSON
types.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes up for review

2015-01-31 Thread Tom Lane
I wrote:
 First draft of release notes for the upcoming minor releases is committed
 at
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=77e9125e847adf76e9466814781957c0f32d8554
 It should be visible on the documentation website after guaibasaurus does
 its next buildfarm run, a couple hours from now.

BTW, that's up now at

http://www.postgresql.org/docs/devel/static/release-9-4-1.html

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] File based Incremental backup v9

2015-01-31 Thread Marco Nenciarini
On 31/01/15 17:22, Erik Rijkers wrote:
 On Sat, January 31, 2015 15:14, Marco Nenciarini wrote:
 
 0001-public-parse_filename_for_nontemp_relation.patch
 0002-copydir-LSN-v2.patch
 0003-File-based-incremental-backup-v8.patch
 
 Hi,
 
 It looks like it only compiles with assert enabled.
 

It is due to a typo (assert instead of Assert). You can find the updated
patch attached to this message.
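
That also explains the failure mode: without --enable-cassert, frontend
code never includes <assert.h>, so a bare lowercase assert() gets no
declaration and no symbol to link against, while PostgreSQL's Assert()
macro simply compiles to nothing. The corrected line presumably reads:

    /* Assert() (from c.h) is a no-op unless built with --enable-cassert */
    Assert(res || (strcmp(basedir, "-") == 0));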

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From 3e451077283de8e99c4eceb748d49c34329c6ef8 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini marco.nenciar...@2ndquadrant.it
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation

---
 src/backend/storage/file/reinit.c | 58 ---
 src/common/relpath.c  | 56 +
 src/include/common/relpath.h  |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c 
b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*** static void ResetUnloggedRelationsInTabl
*** 28,35 
  int 
op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
-   int 
*oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 
*** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 
fsync_fname((char *) dbspacedirname, true);
}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
-   
ForkNumber *fork)
- {
-   int pos;
- 
-   /* Look for a non-empty string of digits (that isn't too long). */
-   for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
-   ;
-   if (pos == 0 || pos > OIDCHARS)
-   return false;
-   *oidchars = pos;
- 
-   /* Check for a fork name. */
-   if (name[pos] != '_')
-   *fork = MAIN_FORKNUM;
-   else
-   {
-   int forkchar;
- 
-   forkchar = forkname_chars(name[pos + 1], fork);
-   if (forkchar <= 0)
-   return false;
-   pos += forkchar + 1;
-   }
- 
-   /* Check for a segment number. */
-   if (name[pos] == '.')
-   {
-   int segchar;
- 
-   for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
-   ;
-   if (segchar <= 1)
-   return false;
-   pos += segchar;
-   }
- 
-   /* Now we should be at the end. */
-   if (name[pos] != '\0')
-   return false;
-   return true;
- }
--- 386,388 
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 
--- 206,264 
}
return path;
  }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is entitled to assume that
+  * *oidchars has been set to the a value no more than OIDCHARS, and thus
+  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
+  * portion of the filename.  This is critical to protect against a possible
+  * buffer overrun.
+  */
+ bool
+ parse_filename_for_nontemp_relation(const char *name, int *oidchars,
+   
ForkNumber *fork)
+ {
+   int pos;
+ 
+   /* Look for a non-empty string of digits (that isn't too long). */
+   for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
+   ;
+   if (pos == 0 || pos 

Re: [HACKERS] POLA violation with \c service=

2015-01-31 Thread Pavel Stehule
Hi all

I am sending a review of this patch:

* What does it do? - Allows connecting to another db by \connect using the
URI connection format

postgres=# \c postgresql://localhost?service=old
psql (9.5devel, server 9.2.9)
You are now connected to database "postgres" as user "pavel".

* Do we want this feature? - yes, it eliminates the inconsistency between
command-line connect and \connect. It is a good idea without any objections.

* This patch is cleanly applicable, later compilation without any issues

* All regress tests passed

* The psql documentation is updated -- this feature (and format) is not
widely known, so maybe some more examples are welcome

* When I tested this feature, it worked as expected

* Code respects PostgreSQL coding rules. I would prefer a slightly different
test for whether to keep the password; the current code is a little harder to
understand. But I can live with David's code too.

    if (!user)
        user = PQuser(o_conn);

    if (!host)
        host = PQhost(o_conn);

    if (!port)
        port = PQport(o_conn);

    if (dbname)
        has_connection_string = recognized_connection_string(dbname);

    /* we should not keep the password if some connection property is changed */
    keep_password =
        strcmp(user, PQuser(o_conn)) == 0 &&
        strcmp(host, PQhost(o_conn)) == 0 &&
        strcmp(port, PQport(o_conn)) == 0 &&
        !has_connection_string;


I have no other comments.

Possible questions:
  1. more examples in the docs
  2. a small change in how keep_password is checked

Regards

Pavel


2015-01-13 15:00 GMT+01:00 David Fetter da...@fetter.org:

 On Sat, Jan 10, 2015 at 04:41:16PM -0800, David Fetter wrote:
  On Sat, Jan 10, 2015 at 09:30:57AM +0100, Erik Rijkers wrote:
   On Fri, January 9, 2015 20:15, David Fetter wrote:
[psql_fix_uri_service_003.patch]
  
   Applies on master; the feature (switching services) works well but a
 \c without any parameters produces a segfault:
  
   (centos 6.6, 4.9.2, 64-bit)
  
  
    $ echo -en "$PGSERVICEFILE\n$PGSERVICE\n$PGPORT\n"
   /home/aardvark/.pg_service
   service_pola
   6968
  
   $ psql
   Timing is on.
   psql (9.5devel_service_pola_20150109_2340_ac7009abd228)
    Type "help" for help.
  
   testdb=# \c service=HEAD
    You are now connected to database "testdb" as user "aardvark" via
  socket in "/tmp" at port "6545".
   testdb=# \c service=service_pola
    You are now connected to database "testdb" as user "aardvark" via
  socket in "/tmp" at port "6968".
   testdb=# \c
   Segmentation fault (core dumped)
 
  Fixed by running that function only if the argument exists.
 
  More C cleanups, too.

 Added to the upcoming commitfest.

 Cheers,
 David.
 --
 David Fetter da...@fetter.org http://fetter.org/
 Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
 Skype: davidfetter  XMPP: david.fet...@gmail.com

 Remember to vote!
 Consider donating to Postgres: http://www.postgresql.org/about/donate


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



[HACKERS] Draft release notes up for review

2015-01-31 Thread Tom Lane
First draft of release notes for the upcoming minor releases is committed
at

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=77e9125e847adf76e9466814781957c0f32d8554

It should be visible on the documentation website after guaibasaurus does
its next buildfarm run, a couple hours from now.

Please let me know of any corrections ASAP.

Note that I've been fairly ruthless about removing items altogether if
they were unlikely to have clear user-visible impact, because the notes
were eye-glazingly long already.  Please mention it if you see any
remaining items that we could dispense with ... or if you think I
underestimated the significance of anything that got dropped.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Draft release notes up for review

2015-01-31 Thread Tom Lane
Amit Langote amitlangot...@gmail.com writes:
 Perhaps the following two missed?

 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=deadbf4f3324f7b2826cac60dd212dfa1b0084ec

 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cd63c57e5cbfc16239aa6837f8b7043a721cdd28

Hm?  Those are both included ... right next to each other in fact,
starting at about line 275 in release-9.4.sgml as it currently stands.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Small doc patch about pg_service.conf

2015-01-31 Thread Peter Eisentraut
On 1/3/15 7:56 PM, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
 On Sat, Jan 03, 2015 at 02:36:50PM -0500, Noah Misch wrote:
 The directory libpq consults is `pg_config --sysconfdir`
 
 I was wrong there.  `pg_config --sysconfdir` uses get_etc_path(), which
 adjusts to post-installation moves of the installation tree.
 parseServiceInfo() uses the build-time SYSCONFDIR directly.
 
 Ugh.  That makes things messy.  Still, we're probably better off
 recommending `pg_config --sysconfdir` than anything else.  I doubt
 that the theoretical ability to do a post-installation move is used
 much, and it's certainly not used by people using a prepackaged build.
 So the recommendation would only be wrong for someone who had done such
 a move manually, and they'd probably know what to do anyway.

At least writing `pg_config --sysconfdir` indicates that it's in an
installation-specific location, whereas hardcoding /etc will create
confusion when it's not actually there.  (Incidentally, we use
/usr/local/pgsql/etc elsewhere in the documentation as a sample location.)

I propose the attached patch.

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index e5cab37..75a4d51 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -6957,7 +6957,7 @@ titleThe Connection Service File/title
at filename~/.pg_service.conf/filename or the location
specified by the environment variable envarPGSERVICEFILE/envar,
or it can be a system-wide file
-   at filename/etc/pg_service.conf/filename or in the directory
+   at filename`pg_config --sysconfdir`/pg_service.conf/filename or in the 
directory
specified by the environment variable
envarPGSYSCONFDIR/envar.  If service definitions with the same
name exist in the user and the system file, the user file takes

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

2015-01-31 Thread Andres Freund
On 2015-01-30 18:59:28 +0100, Heikki Linnakangas wrote:
 On 01/15/2015 03:03 AM, Andres Freund wrote:
 0004: Process 'die' interrupts while reading/writing from the client socket.
 
This is the reason Horiguchi-san started this thread.
 
 +ProcessClientWriteInterrupt(!port-noblock);
 ...
 +/*
 + * ProcessClientWriteInterrupt() - Process interrupts specific to client 
 writes
 + *
 + * This is called just after low-level writes. That might be after the read
 + * finished successfully, or it was interrupted via interrupt. 'blocked' 
 tells
 + * us whether the
 + *
 + * Must preserve errno!
 + */
 +void
 +ProcessClientWriteInterrupt(bool blocked)
 
 You're passing port->noblock as argument, but I thought the argument is
 supposed to mean whether the write would've blocked, i.e. if the write
 buffer was full.

Right.

 port->noblock doesn't mean that. But perhaps I misunderstood this -
 the comment on the 'blocked' argument above is a bit incomplete ;-).

The point here is that we tried the send() and then were interrupted
while waiting for the buffer to become writable. That pretty much
implies that the write buffer is full. The !port->noblock is because we
are only blocked on write if we're doing a blocking send; otherwise we'll
just return control to the caller...

I agree that this could use some more comments ;)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tablespaces inside $PGDATA considered harmful

2015-01-31 Thread Marc Mamin
 it is just as likely they simply are not aware
 of the downsides and the only reason they put it is $PGDATA is that
 it seemed like a logical place to put a directory that is intended to hold
 database data.

Yes, this is the reason why we got into this issue. The name PGDATA is
misleading.

 The creators of tablespaces seem to have envisioned their usage as a means
 of pulling in disparate file systems and not simply for namespaces within the 
 main
 filesystem that $PGDATA exists on.

True too. We have a lot of tablespaces. I probably wouldn't go that way now,
but it still has the advantage of helping to quickly move parts of the data
to manage filesystem usage.

 Given all this, it seems like a good idea to at least give a warning
 if somebody tries to create a tablespace inside the data directory.

IMHO the first place to put a warning is within the documentation:
 http://www.postgresql.org/docs/9.4/interactive/manage-ag-tablespaces.html
 and possibly a crosslink in 
http://www.postgresql.org/docs/9.4/interactive/sql-createtablespace.html
 
If this is intended to be back-patched then I'd go with just a warning. If
this is strictly 9.5 material then I'd say that since our own tools behave
badly in the current situation we should simply outright disallow it. 

We have a lot of maintenance scripts that rely on our architecture
($PGDATA -> symlinks -> tablespace locations).
We already made a quick evaluation of how to fix this, but gave it up
for now due to the amount of work.
So please be cautious about disallowing it too abruptly.
Back-patching a change that disallows our current architecture could prevent
us from applying minor releases for a while...

regards,

Marc Mamin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Safe memory allocation functions

2015-01-31 Thread Michael Paquier
On Sat, Jan 31, 2015 at 2:58 AM, Robert Haas robertmh...@gmail.com wrote:
 Committed.  I didn't think we really need to expose two separate flags
 for the aligned and unaligned cases, so I ripped that out.  I also
 removed the duplicate documentation of the new constants in the
 function header; having two copies of the documentation, one far
 removed from the constants themselves, is a recipe for them eventually
 getting out of sync.  I also moved the function to what I thought was
 a more logical place in the file, and rearranged the order of tests
 slightly so that, in the common path, we test only whether ret == NULL
 and not anything else.
Thanks a lot!

I have reworked a bit the module used for the tests of the new
extended routine (with new tests for Alloc, AllocZero and AllocHuge as
well) and pushed it here:
https://github.com/michaelpq/pg_plugins/tree/master/mcxtalloc_test
This is just to mention it for the sake of the archives; I don't
really believe that it should be an in-core module in src/test/modules,
as it does allocations of 1GB in environments where there is enough
memory. Note that it contains regression tests with alternate outputs
depending on the environment where they are run (still, the output may not
be stable depending on where those tests are run, particularly where
memory usage varies a lot).
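
As a pointer, a minimal usage sketch of the committed routine in its
no-OOM mode (per the new constants in the memory-context API; the wrapper
itself is illustrative):

    #include "postgres.h"
    #include "utils/memutils.h"

    static void *
    alloc_no_oom(Size size)
    {
        /* returns NULL instead of elog(ERROR) on allocation failure */
        return MemoryContextAllocExtended(CurrentMemoryContext, size,
                                          MCXT_ALLOC_NO_OOM);
    }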
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)

2015-01-31 Thread Andres Freund
On 2015-01-31 00:52:03 +0100, Heikki Linnakangas wrote:
 On 01/15/2015 03:03 AM, Andres Freund wrote:
 0004: Process 'die' interrupts while reading/writing from the client socket.
 
 + * Check for interrupts here, in addition to 
 secure_write(),
 + * because a interrupted write in secure_raw_write() 
 can possibly
 + * only return to here, not to secure_write().
   */
 +ProcessClientWriteInterrupt(true);
 
 Took me a while to parse that sentence. How about:
 
 Check for interrupts here, in addition to secure_write(), because an
 interrupted write in secure_raw_write() will return here, and we cannot
 return to secure_write() until we've written something.

Yep, that's better.

 0005: Move deadlock and other interrupt handling in proc.c out of signal 
 handlers.
 
I'm surprised how comparatively simple this turned out to be. For
robustness and understanding I think this is a great improvement.
 
New and not reviewed at all. Needs significant review. But works
in my simple testcases.
 
 @@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
  proc = MyProc;
  MyProc = NULL;
 DisownLatch(&proc->procLatch);
 +SetLatch(MyLatch);
 
  SpinLockAcquire(ProcStructLock);
 
 @@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
  proc = MyProc;
  MyProc = NULL;
 DisownLatch(&proc->procLatch);
 +SetLatch(MyLatch);
 
  SpinLockAcquire(ProcStructLock);
 
 These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already sets
 the latch. (According to the comment in SwitchToSharedLatch() even that
 should not be necessary.)

Ick, that's some debugging leftovers where I was trying to find a bug
that was fundamentally caused by the missing barriers in the latch
code...

 0006: Don't allow immediate interupts during authentication anymore.
 
 In commit message: s/interupts/interrupts/.
 
 @@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char 
 *client_pass,
  if (*shadow_pass == '\0')
  return STATUS_ERROR;/* empty password */
 
 -/* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
 -ImmediateInterruptOK = true;
 -/* And don't forget to detect one that already arrived */
  CHECK_FOR_INTERRUPTS();
 
  /*
 
 That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly placed
 now that the ImmediateInterruptOK = true is gone. I would just remove it.
 Add one to the caller if it's needed, but it probably isn't.

There's some method to the madness here actually - we just did some
database stuff that could in theory take a while... Whether it's
worthwhile to leave it here I'm not sure.

 2) Does anybody see a significant problem with the reduction of cases
 where we immediately react to authentication_timeout? Since we still
 react immediately when we're waiting for the client (except maybe in
 sspi/kerberos?) I think that's ok. The waiting cases are slow
 ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
 out of the relevant code anyway.
 
 Yeah, I think that's OK. Doing those things with ImmediateInterruptOK=true
 was quite scary, and we need to stop doing that. It would be nice to have a
 wholesale solution for those lookups but I don't see one. So all the
 ident/radius/kerberos/ldap lookups will need to be done in a non-blocking
 way to have them react to the timeout. That requires some work, but we can
 leave that to a followup patch, if someone wants to write it.

Ok, cool.

 Overall, 1-8 look good to me, except for that one incomplete comment in
 ProcessClientWriteInterrupt() I mentioned in a separate email.

Thanks for the review! I plan to push the fixed up versions sometime
next week.

 I haven't reviewed patches 9 and 10 yet.

Yea, those aren't as close to being ready (and thus marked WIP).

I'd like to do the lwlock stuff for 9.5 because then any sane setup
(i.e. unless spinlocks are emulated) doesn't need any semaphores
anymore, which makes setup easier. Right now users need to adjust system
settings when changing max_connections upwards or running multiple
clusters. But it's not really related to getting rid of
ImmediateInterruptOK.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Streaming replication and WAL archive interactions

2015-01-31 Thread Andres Freund
Hi,

On 2014-12-19 22:56:40 +0200, Heikki Linnakangas wrote:
 This adds two new archive_modes, 'shared' and 'always', to indicate whether
 the WAL archive is shared between the primary and standby, or not. In
 shared mode, the standby tracks which files have been archived by the
 primary. The standby refrains from recycling files that the primary has
 not yet archived, and at failover, the standby archives all those files too
 from the old timeline. In 'always' mode, the standby's WAL archive is
 taken to be separate from the primary's, and the standby independently
 archives all files it receives from the primary.

I don't really like this approach. Sharing an archive is rather dangerous
in my experience - if your old master comes up again (and writes to the
last WAL file) or similar, you can get into really bad situations.

What I was thinking about instead was trying to detect the point up to
which files were safely archived by running the restore command to check for
the presence of archived files, then archiving anything that has valid
content and isn't yet archived. That doesn't sound particularly
complicated to me.
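
Roughly, the detection loop would look like this (all helper names here
are hypothetical; it only sketches the approach):

    /* probe the archive via restore_command and archive what's missing */
    for (seg = oldest_local_segment; seg <= last_complete_segment; seg++)
    {
        if (RestoreCommandFinds(seg))       /* hypothetical probe */
            continue;                       /* already safely archived */
        if (SegmentHasValidContent(seg))    /* hypothetical validity check */
            ArchiveSegment(seg);
    }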

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers