Re: [HACKERS] Parallel Seq Scan

2015-01-29 Thread Jeff Janes
On Tue, Jan 27, 2015 at 11:08 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 On 01/28/2015 04:16 AM, Robert Haas wrote:

 On Tue, Jan 27, 2015 at 6:00 PM, Robert Haas robertmh...@gmail.com
 wrote:

 Now, when you did what I understand to be the same test on the same
 machine, you got times ranging from 9.1 seconds to 35.4 seconds.
 Clearly, there is some difference between our test setups.  Moreover,
 I'm kind of suspicious about whether your results are actually
 physically possible.  Even in the best case where you somehow had the
 maximum possible amount of data - 64 GB on a 64 GB machine - cached,
 leaving no space for cache duplication between PG and the OS and no
 space for the operating system or postgres itself - the table is 120
 GB, so you've got to read *at least* 56 GB from disk.  Reading 56 GB
 from disk in 9 seconds represents an I/O rate of 6 GB/s. I grant that
 there could be some speedup from issuing I/O requests in parallel
 instead of serially, but that is a 15x speedup over dd, so I am a
 little suspicious that there is some problem with the test setup,
 especially because I cannot reproduce the results.


 So I thought about this a little more, and I realized after some
 poking around that hydra's disk subsystem is actually six disks
 configured in a software RAID5[1].  So one advantage of the
 chunk-by-chunk approach you are proposing is that you might be able to
 get all of the disks chugging away at once, because the data is
 presumably striped across all of them.  Reading one block at a time,
 you'll never have more than 1 or 2 disks going, but if you do
 sequential reads from a bunch of different places in the relation, you
 might manage to get all 6.  So that's something to think about.

 One could imagine an algorithm like this: as long as there are more
 1GB segments remaining than there are workers, each worker tries to
 chug through a separate 1GB segment.  When there are not enough 1GB
 segments remaining for that to work, then they start ganging up on the
 same segments.  That way, you get the benefit of spreading out the I/O
 across multiple files (and thus hopefully multiple members of the RAID
 group) when the data is coming from disk, but you can still keep
 everyone busy until the end, which will be important when the data is
 all in-memory and you're just limited by CPU bandwidth.


 OTOH, spreading the I/O across multiple files is not a good thing, if you
 don't have a RAID setup like that. With a single spindle, you'll just
 induce more seeks.

 Perhaps the OS is smart enough to read in large-enough chunks that the
 occasional seek doesn't hurt much. But then again, why isn't the OS smart
 enough to read in large-enough chunks to take advantage of the RAID even
 when you read just a single file?


In my experience with RAID, it is smart enough to take advantage of that.
If the raid controller detects a sequential access pattern read, it
initiates a read ahead on each disk to pre-position the data it will need
(or at least, the behavior I observe is as-if it did that).  But maybe if
the sequential read is a bunch of random reads from different processes
which just happen to add up to sequential, that confuses the algorithm?

Cheers,

Jeff


Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Bruce Momjian
On Wed, Jan 28, 2015 at 09:26:11PM -0800, Josh Berkus wrote:
 3. Check that the replica is not very lagged.  If it is, wait for
 traffic to die down and for it to catch up.

Is this necessary?  It seems quite imprecise, too.

 4. Shut down the master using -m fast or -m smart for a clean shutdown.
  It is not necessary to shut down the replicas yet.

We already give instructions on how to shut down the server in the
pg_upgrade docs.

 5. pg_upgrade the master using the --link option.  Do not start the new
 version yet.

Stephen mentioned that --link is not clear in the old docs --- I fixed
that.

 6. create a data directory for the new version on the replica.  This
 directory should be empty; if it was initdb'd by the installation
 package, then delete its contents.

rsync will create this for you.

 10. Start the master, then the replica

I have incorporated all your suggestions in the attached patch.  I also
split items into separate sections as you suggested.  You can read the
end result here:

http://momjian.us/tmp/pgsql/pgupgrade.html

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
new file mode 100644
index 07ca0dc..e25e0d0
*** a/doc/src/sgml/backup.sgml
--- b/doc/src/sgml/backup.sgml
*** tar -cf backup.tar /usr/local/pgsql/data
*** 438,445 
 Another option is to use <application>rsync</> to perform a file
 system backup.  This is done by first running <application>rsync</>
 while the database server is running, then shutting down the database
!server just long enough to do a second <application>rsync</>.  The
!second <application>rsync</> will be much quicker than the first,
 because it has relatively little data to transfer, and the end result
 will be consistent because the server was down.  This method
 allows a file system backup to be performed with minimal downtime.
--- 438,447 
 Another option is to use <application>rsync</> to perform a file
 system backup.  This is done by first running <application>rsync</>
 while the database server is running, then shutting down the database
!server long enough to do an <command>rsync --checksum</>.
!(<option>--checksum</> is necessary because <command>rsync</> only
!has file modification-time granularity of one second.)  The
!second <application>rsync</> will be quicker than the first,
 because it has relatively little data to transfer, and the end result
 will be consistent because the server was down.  This method
 allows a file system backup to be performed with minimal downtime.
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
new file mode 100644
index e1cd260..a97a393
*** a/doc/src/sgml/pgupgrade.sgml
--- b/doc/src/sgml/pgupgrade.sgml
*** NET STOP postgresql-8.4
*** 315,320 
--- 315,324 
  NET STOP postgresql-9.0
 </programlisting>
 </para>
+ 
+ <para>
+  Log-shipping standby servers can remain running until a later step.
+ </para>
 </step>
  
 <step>
*** pg_upgrade.exe
*** 399,404 
--- 403,513 
 </step>
  
 <step>
+ <title>Upgrade any Log-Shipping Standby Servers</title>
+ 
+ <para>
+  If you have Log-Shipping Standby Servers (<xref
+  linkend="warm-standby">), follow these steps to upgrade them (before
+  starting any servers):
+ </para>
+ 
+ <procedure>
+ 
+  <step>
+   <title>Install the new PostgreSQL binaries on standby servers</title>
+ 
+   <para>
+Make sure the new binaries and support files are installed on all
+standby servers.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Make sure the new standby data directories do <emphasis>not</>
+   exist</title>
+ 
+   <para>
+Make sure the new standby data directories do <emphasis>not</>
+exist or are empty.  If <application>initdb</> was run, delete
+the standby server data directories.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Install custom shared object files</title>
+ 
+   <para>
+Install the same custom shared object files on the new standbys
+that you installed in the new master cluster.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Stop standby servers</title>
+ 
+   <para>
+If the standby servers are still running, stop them now using the
+above instructions.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Save configuration files</title>
+ 
+   <para>
+Save any configuration files from the standbys you need to keep,
+e.g.  <filename>postgresql.conf</>, <literal>recovery.conf</>,
+as these will be overwritten or removed in the next step.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Run <application>rsync</></title>
+ 
+   <para>
+From a directory that is above the old and new database cluster
+directories, run this 

Re: [HACKERS] Parallel Seq Scan

2015-01-29 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes:
 On Tue, Jan 27, 2015 at 11:08 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:
 OTOH, spreading the I/O across multiple files is not a good thing, if you
 don't have a RAID setup like that. With a single spindle, you'll just
 induce more seeks.
 
 Perhaps the OS is smart enough to read in large-enough chunks that the
 occasional seek doesn't hurt much. But then again, why isn't the OS smart
 enough to read in large-enough chunks to take advantage of the RAID even
 when you read just a single file?

 In my experience with RAID, it is smart enough to take advantage of that.
 If the raid controller detects a sequential access pattern read, it
 initiates a read ahead on each disk to pre-position the data it will need
 (or at least, the behavior I observe is as-if it did that).  But maybe if
 the sequential read is a bunch of random reads from different processes
 which just happen to add up to sequential, that confuses the algorithm?

If seqscan detection is being done at the level of the RAID controller,
I rather imagine that the controller would not know which process had
initiated which read anyway.  But if it's being done at the level of the
kernel, it's a whole nother thing, and I bet it *would* matter.

regards, tom lane




Re: [HACKERS] TABLESAMPLE patch

2015-01-29 Thread Bruce Momjian
On Thu, Jan 29, 2015 at 11:08:55AM -0500, Robert Haas wrote:
 On Wed, Jan 28, 2015 at 5:19 AM, Petr Jelinek p...@2ndquadrant.com wrote:
  Yes, that's my view too. I would generally be for that change also and it
  would be worth it if the code was used in more than one place, but as it is
  it seems like it will just add code/complexity for no real benefit. It would
  make sense in case we used sequential scan node instead of the new node as
  Amit also suggested, but I remain unconvinced that mixing sampling and
  sequential scan into single scan node would be a good idea.
 
 Based on previous experience, I expect that any proposal to merge
 those nodes would get shot down by Tom with his laser-guided atomic
 bazooka faster than you can say "-1 from me regards tom lane".

Do we get illustrations with that?  ;-)  I want a poster for my wall!

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Memory leak in gingetbitmap

2015-01-29 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 [ assorted GIN leaks ]

 I think we need a more wholesale approach. I'm thinking of adding a new 
 memory context to contain everything related to the scan keys, which can 
 then be destroyed in whole.

 We haven't heard any complaints about this from users, but I think this 
 deserves to be fixed. Perhaps not worth back-patching however.

+1 to using a context instead of a lot of retail pfrees, and I concur
that we shouldn't back-patch (barring seeing some field complaints).

regards, tom lane




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Bruce Momjian
On Thu, Jan 29, 2015 at 10:21:30AM -0500, Andrew Dunstan wrote:
 
 On 01/29/2015 12:26 AM, Josh Berkus wrote:
 So, for my 2c, I'm on the fence about it.  On the one hand, I agree,
 it's a bit of a complex process to get right.  On the other hand, it's
 far better if we put something out there along the lines of "if you
 really want to, this is how to do it" than having folks try to fumble
 through to find the correct steps themselves.
 So, here's the correct steps for Bruce, because his current doc does not
 cover all of these.  I really think this should go in as a numbered set
 of steps; the current doc has some steps as steps, and other stuff
 buried in paragraphs.
 
 1. Install the new version binaries on both servers, alongside the old
 version.
 
 2. If not done by the package install, initdb the new version's data
 directory.
 
 3. Check that the replica is not very lagged.  If it is, wait for
 traffic to die down and for it to catch up.
 
 4. Shut down the master using -m fast or -m smart for a clean shutdown.
   It is not necessary to shut down the replicas yet.
 
 5. pg_upgrade the master using the --link option.  Do not start the new
 version yet.
 
 6. create a data directory for the new version on the replica.  This
 directory should be empty; if it was initdb'd by the installation
 package, then delete its contents.
 
 7. shut down postgres on the replica.
 
 8. rsync both the old and new data directories from the master to the
 replica, using the --size-only and -H hard links options.  For example,
 if both 9.3 and 9.4 are in /var/lib/postgresql, do:
 
 rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/
 replica-host:/var/lib/postgresql/
 
 9. Create a recovery.conf file in the replica's data directory with the
 appropriate parameters.
 
 10. Start the master, then the replica
 
 
 
 I find steps 2 and 6 confusing.

For number 2, he is creating a new cluster on the master server.  For
#6, he is just creating an empty data directory, though this is not
required as rsync will create the directory for you.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Exposing the stats snapshot timestamp to SQL

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 12:18 AM, Matt Kelly mkell...@gmail.com wrote:
 In a previous thread Tom Lane said:

 (I'm also wondering if it'd make sense to expose the stats timestamp
 as a callable function, so that the case could be dealt with
 programmatically as well.  But that's future-feature territory.)


 (http://www.postgresql.org/message-id/27251.1421684...@sss.pgh.pa.us)

 It seemed the appropriate scope for my first submission, and that feature
 has been on my wish list for a while, so I thought I'd grab it.

 Main purpose of this patch is to expose the timestamp of the current stats
 snapshot so that it can be leveraged by monitoring code.  The obvious case
 is alerting if the stats collector becomes unresponsive.  The common use
 case is to smooth out graphs which are built from high frequency monitoring
 of the stats collector.

 The timestamp is already available as part of PgStat_GlobalStats.  This
 patch is just the boilerplate (+docs  tests) needed to expose that value to
 SQL.  It shouldn't impact anything else in the server.

 I'm not particularly attached to the function name, but I didn't have a
 better idea.

 The patch should apply cleanly to master.
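
For readers unfamiliar with the pattern, the boilerplate for exposing that
value amounts to something like the following; the function name here is only
a guess at what the patch might use, and the pg_proc entry, documentation and
tests are omitted:

    #include "postgres.h"
    #include "fmgr.h"
    #include "pgstat.h"
    #include "utils/timestamp.h"

    /* SQL-callable wrapper returning the timestamp of the current stats snapshot. */
    Datum
    pg_stat_get_snapshot_timestamp(PG_FUNCTION_ARGS)
    {
        /* PgStat_GlobalStats already carries the snapshot's timestamp. */
        PG_RETURN_TIMESTAMPTZ(pgstat_fetch_global()->stats_timestamp);
    }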

Please add your patch here so we don't forget about it:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Bruce Momjian
On Tue, Jan 27, 2015 at 10:16:48PM -0500, David Steele wrote:
 This is definitely an edge case.  Not only does the file have to be
 modified in the same second *after* rsync has done the copy, but the
 file also has to not be modified in *any other subsequent second* before
 the next incremental backup.  If the file is busy enough to have a
 collision with rsync in that second, then it is very likely to be
 modified before the next incremental backup which is generally a day or
 so later.  And, of course, the backup where the issue occurs is fine -
 it's the next backup that is invalid.
 
 However, the hot/cold backup scheme as documented does make the race
 condition more likely since the two backups are done in close proximity
 temporally.  Ultimately, the most reliable method is to use checksums.
 
 For me the biggest issue is that there is no way to discover if a db is
 consistent no matter how much time/resources you are willing to spend. 
 I could live with the idea of the occasional bad backup (since I keep as
 many as possible), but having no way to know whether it is good or not
 is very frustrating.  I know data checksums are a step in that
 direction, but they are a long way from providing the optimal solution. 
 I've implemented rigorous checksums in PgBackRest but something closer
 to the source would be even better.

Agreed.  I have updated the two mentions of rsync in our docs to clarify
this.  Thank you.

The patch also has pg_upgrade doc improvements suggested by comments
from Josh Berkus.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
new file mode 100644
index 07ca0dc..e25e0d0
*** a/doc/src/sgml/backup.sgml
--- b/doc/src/sgml/backup.sgml
*** tar -cf backup.tar /usr/local/pgsql/data
*** 438,445 
 Another option is to use <application>rsync</> to perform a file
 system backup.  This is done by first running <application>rsync</>
 while the database server is running, then shutting down the database
!server just long enough to do a second <application>rsync</>.  The
!second <application>rsync</> will be much quicker than the first,
 because it has relatively little data to transfer, and the end result
 will be consistent because the server was down.  This method
 allows a file system backup to be performed with minimal downtime.
--- 438,447 
 Another option is to use <application>rsync</> to perform a file
 system backup.  This is done by first running <application>rsync</>
 while the database server is running, then shutting down the database
!server long enough to do an <command>rsync --checksum</>.
!(<option>--checksum</> is necessary because <command>rsync</> only
!has file modification-time granularity of one second.)  The
!second <application>rsync</> will be quicker than the first,
 because it has relatively little data to transfer, and the end result
 will be consistent because the server was down.  This method
 allows a file system backup to be performed with minimal downtime.
diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml
new file mode 100644
index e1cd260..ed65def
*** a/doc/src/sgml/pgupgrade.sgml
--- b/doc/src/sgml/pgupgrade.sgml
*** pg_upgrade.exe
*** 409,414 
--- 409,504 
 </step>
  
 <step>
+ <title>Upgrade any Log-Shipping Standby Servers</title>
+ 
+ <para>
+  If you have Log-Shipping Standby Servers (<xref
+  linkend="warm-standby">), follow these steps to upgrade them (before
+  starting any servers):
+ </para>
+ 
+ <procedure>
+ 
+  <step>
+   <title>Install the new PostgreSQL binaries on standby servers</title>
+ 
+   <para>
+Make sure the new binaries and support files are installed
+on all the standby servers.  Do <emphasis>not</> run
+<application>initdb</>.  If <application>initdb</> was run, delete
+the standby server data directories.  Also, install any custom
+shared object files on the new standbys that you installed in the
+new master cluster.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Shutdown the Standby Servers</title>
+ 
+   <para>
+If the standby servers are still running, shut them down.  Save any
+configuration files from the standbys you need to keep, e.g.
+<filename>postgresql.conf</>, <literal>recovery.conf</>, as these
+will be overwritten or removed in the next step.
+   </para>
+  </step>
+ 
+  <step>
+   <title>Run <application>rsync</></title>
+ 
+   <para>
+From a directory that is above the old and new database cluster
+directories, run this for each slave:
+ 
+ <programlisting>
+rsync --archive --hard-links --size-only old_pgdata new_pgdata remote_dir
+ </programlisting>
+ 
+where <option>old_pgdata</> and <option>new_pgdata</> are relative
+to 

Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Andrew Dunstan


On 01/29/2015 12:26 AM, Josh Berkus wrote:

So, for my 2c, I'm on the fence about it.  On the one hand, I agree,
it's a bit of a complex process to get right.  On the other hand, it's
far better if we put something out there along the lines of "if you
really want to, this is how to do it" than having folks try to fumble
through to find the correct steps themselves.

So, here's the correct steps for Bruce, because his current doc does not
cover all of these.  I really think this should go in as a numbered set
of steps; the current doc has some steps as steps, and other stuff
buried in paragraphs.

1. Install the new version binaries on both servers, alongside the old
version.

2. If not done by the package install, initdb the new version's data
directory.

3. Check that the replica is not very lagged.  If it is, wait for
traffic to die down and for it to catch up.

4. Shut down the master using -m fast or -m smart for a clean shutdown.
  It is not necessary to shut down the replicas yet.

5. pg_upgrade the master using the --link option.  Do not start the new
version yet.

6. create a data directory for the new version on the replica.  This
directory should be empty; if it was initdb'd by the installation
package, then delete its contents.

7. shut down postgres on the replica.

8. rsync both the old and new data directories from the master to the
replica, using the --size-only and -H hard links options.  For example,
if both 9.3 and 9.4 are in /var/lib/postgresql, do:

rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/
replica-host:/var/lib/postgresql/

9. Create a recovery.conf file in the replica's data directory with the
appropriate parameters.

10. Start the master, then the replica




I find steps 2 and 6 confusing.

cheers

andrew




Re: [HACKERS] Parallel Seq Scan

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 11:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In my experience with RAID, it is smart enough to take advantage of that.
 If the raid controller detects a sequential access pattern read, it
 initiates a read ahead on each disk to pre-position the data it will need
 (or at least, the behavior I observe is as-if it did that).  But maybe if
 the sequential read is a bunch of random reads from different processes
 which just happen to add up to sequential, that confuses the algorithm?

 If seqscan detection is being done at the level of the RAID controller,
 I rather imagine that the controller would not know which process had
 initiated which read anyway.  But if it's being done at the level of the
 kernel, it's a whole nother thing, and I bet it *would* matter.

That was my feeling too.  On the machine that Amit and I have been
using for testing, we can't find any really convincing evidence that
it matters.  I won't be a bit surprised if there are other systems
where it does matter, but I don't know how to find them except to
encourage other people to help test.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Andrew Dunstan


On 01/29/2015 11:34 AM, Bruce Momjian wrote:

On Thu, Jan 29, 2015 at 10:21:30AM -0500, Andrew Dunstan wrote:

On 01/29/2015 12:26 AM, Josh Berkus wrote:

So, for my 2c, I'm on the fence about it.  On the one hand, I agree,
it's a bit of a complex process to get right.  On the other hand, it's
far better if we put something out there along the lines of "if you
really want to, this is how to do it" than having folks try to fumble
through to find the correct steps themselves.

So, here's the correct steps for Bruce, because his current doc does not
cover all of these.  I really think this should go in as a numbered set
of steps; the current doc has some steps as steps, and other stuff
buried in paragraphs.

1. Install the new version binaries on both servers, alongside the old
version.

2. If not done by the package install, initdb the new version's data
directory.

3. Check that the replica is not very lagged.  If it is, wait for
traffic to die down and for it to catch up.

4. Shut down the master using -m fast or -m smart for a clean shutdown.
  It is not necessary to shut down the replicas yet.

5. pg_upgrade the master using the --link option.  Do not start the new
version yet.

6. create a data directory for the new version on the replica.  This
directory should be empty; if it was initdb'd by the installation
package, then delete its contents.

7. shut down postgres on the replica.

8. rsync both the old and new data directories from the master to the
replica, using the --size-only and -H hard links options.  For example,
if both 9.3 and 9.4 are in /var/lib/postgresql, do:

rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/
replica-host:/var/lib/postgresql/

9. Create a recovery.conf file in the replica's data directory with the
appropriate parameters.

10. Start the master, then the replica



I find steps 2 and 6 confusing.

For number 2, he is creating a new cluster on the master server.  For
#6, he is just creating an empty data directory, though this is not
required as rsync will create the directory for you.




Then step 2 should specify that it's for the master.

cheers

andrew





Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Robert Haas
On Wed, Jan 28, 2015 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The cause of this bug is commit 0ad1a816320a2b539a51628e2a0b1e83ff096b1d,
 which I'm inclined to think we need to simply revert, not render even
 more squirrely.

Yes, that commit looks broken.  If you convert from text to JSON you
should get a JSON string equivalent to the text you started out with.
That commit departs from that in favor of allowing \u sequences in
the text being converted to turn into a single character (or perhaps
an encoding error) after a text->json->text roundtrip.  Maybe I
haven't had enough caffeine today, but I can't see how that can
possibly be a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Bruce Momjian
On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote:
 7. shut down postgres on the replica.
 
 8. rsync both the old and new data directories from the master to the
 replica, using the --size-only and -H hard links options.  For example,
 if both 9.3 and 9.4 are in /var/lib/postgresql, do:
 
 rsync -aHv --size-only -e ssh --itemize-changes /var/lib/postgresql/
 replica-host:/var/lib/postgresql/
 
 9. Create a recovery.conf file in the replica's data directory with the
 appropriate parameters.
 
 10. Start the master, then the replica
 
 
 I find steps 2 and 6 confusing.
 For number 2, he is creating a new cluster on the master server.  For
 #6, he is just creating an empty data directory, though this is not
 required as rsync will create the directory for you.
 
 
 
 Then step 2 should specify that it's for the master.

Right.  Josh is just listing all the steps --- the pg_upgrade docs
already have that spelled out in detail.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2015-01-29 Thread Pavel Stehule
2015-01-29 10:28 GMT+01:00 Fabrízio de Royes Mello fabriziome...@gmail.com
:



 Em quinta-feira, 29 de janeiro de 2015, Pavel Stehule 
 pavel.steh...@gmail.com escreveu:

 Hi

 I am testing this feature on a relatively complex schema (38619 tables in the db)
 and I got deadlock

 [pavel@localhost bin]$ /usr/local/pgsql/bin/vacuumdb test2 -fz -j 4
 vacuumdb: vacuuming database test2
 vacuumdb: vacuuming of database test2 failed: ERROR:  deadlock detected
 DETAIL:  Process 24689 waits for RowExclusiveLock on relation 2608 of
 database 194769; blocked by process 24690.
 Process 24690 waits for AccessShareLock on relation 1249 of database
 194769; blocked by process 24689.
 HINT:  See server log for query details.

 ERROR:  deadlock detected
 DETAIL:  Process 24689 waits for RowExclusiveLock on relation 2608 of
 database 194769; blocked by process 24690.
 Process 24690 waits for AccessShareLock on relation 1249 of
 database 194769; blocked by process 24689.
 Process 24689: VACUUM (FULL, ANALYZE) pg_catalog.pg_attribute;
 Process 24690: VACUUM (FULL, ANALYZE) pg_catalog.pg_depend;
 HINT:  See server log for query details.
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_attribute;
 ERROR:  canceling statement due to user request
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_depend;
 ERROR:  canceling statement due to user request
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_class;
 ERROR:  canceling statement due to user request
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_proc;
 LOG:  could not send data to client: Broken pipe
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_proc;
 FATAL:  connection to client lost
 LOG:  could not send data to client: Broken pipe
 ERROR:  canceling statement due to user request
 FATAL:  connection to client lost

   Schema   |      Name       | Type  |  Owner   |  Size  | Description
 ------------+-----------------+-------+----------+--------+-------------
  pg_catalog | pg_attribute    | table | postgres | 439 MB |
  pg_catalog | pg_rewrite      | table | postgres | 314 MB |
  pg_catalog | pg_proc         | table | postgres | 136 MB |
  pg_catalog | pg_depend       | table | postgres | 133 MB |
  pg_catalog | pg_class        | table | postgres | 69 MB  |
  pg_catalog | pg_attrdef      | table | postgres | 55 MB  |
  pg_catalog | pg_trigger      | table | postgres | 47 MB  |
  pg_catalog | pg_type         | table | postgres | 31 MB  |
  pg_catalog | pg_description  | table | postgres | 23 MB  |
  pg_catalog | pg_index        | table | postgres | 20 MB  |
  pg_catalog | pg_constraint   | table | postgres | 17 MB  |
  pg_catalog | pg_shdepend     | table | postgres | 17 MB  |
  pg_catalog | pg_statistic    | table | postgres | 928 kB |
  pg_catalog | pg_operator     | table | postgres | 552 kB |
  pg_catalog | pg_collation    | table | postgres | 232 kB |





Should a pessimistic locking approach be used here instead?

Regards

Pavel



 Regards,

 Fabrízio



 --
 Fabrízio de Royes Mello
 Consultoria/Coaching PostgreSQL
  Timbira: http://www.timbira.com.br
  Blog: http://fabriziomello.github.io
  Linkedin: http://br.linkedin.com/in/fabriziomello
  Twitter: http://twitter.com/fabriziomello
  Github: http://github.com/fabriziomello




Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-29 Thread Dean Rasheed
On 29 January 2015 at 04:00, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Jan 7, 2015 at 3:06 PM, Stephen Frost sfr...@snowman.net wrote:
  If I'm following correctly, Peter's specifically talking about:
 
  [ USING ( <replaceable class="parameter">expression</replaceable> ) ]
  [ WITH CHECK ( <replaceable 
  class="parameter">check_expression</replaceable> ) ]
 
  Where the USING parameter is 'expression' but the WITH CHECK parameter
  is 'check_expression'.  He makes a good point, I believe, as
  expression is overly generic.  I don't like the idea of using
  barrier_expression though as that ends up introducing a new term- how
  about using_expression?

 I've gone ahead and made this minor change.


Don't forget the ALTER POLICY page. This and some of the other things
being discussed on this thread ought to be copied there too.

Regards,
Dean




Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2015-01-29 Thread Fabrízio de Royes Mello
Em quinta-feira, 29 de janeiro de 2015, Pavel Stehule 
pavel.steh...@gmail.com escreveu:

 Hi

 I am testing this feature on a relatively complex schema (38619 tables in the db)
 and I got deadlock

 [pavel@localhost bin]$ /usr/local/pgsql/bin/vacuumdb test2 -fz -j 4
 vacuumdb: vacuuming database test2
 vacuumdb: vacuuming of database test2 failed: ERROR:  deadlock detected
 DETAIL:  Process 24689 waits for RowExclusiveLock on relation 2608 of
 database 194769; blocked by process 24690.
 Process 24690 waits for AccessShareLock on relation 1249 of database
 194769; blocked by process 24689.
 HINT:  See server log for query details.

 ERROR:  deadlock detected
 DETAIL:  Process 24689 waits for RowExclusiveLock on relation 2608 of
 database 194769; blocked by process 24690.
 Process 24690 waits for AccessShareLock on relation 1249 of
 database 194769; blocked by process 24689.
 Process 24689: VACUUM (FULL, ANALYZE) pg_catalog.pg_attribute;
 Process 24690: VACUUM (FULL, ANALYZE) pg_catalog.pg_depend;
 HINT:  See server log for query details.
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_attribute;
 ERROR:  canceling statement due to user request
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_depend;
 ERROR:  canceling statement due to user request
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_class;
 ERROR:  canceling statement due to user request
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_proc;
 LOG:  could not send data to client: Broken pipe
 STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_proc;
 FATAL:  connection to client lost
 LOG:  could not send data to client: Broken pipe
 ERROR:  canceling statement due to user request
 FATAL:  connection to client lost

   Schema   |      Name       | Type  |  Owner   |  Size  | Description
 ------------+-----------------+-------+----------+--------+-------------
  pg_catalog | pg_attribute    | table | postgres | 439 MB |
  pg_catalog | pg_rewrite      | table | postgres | 314 MB |
  pg_catalog | pg_proc         | table | postgres | 136 MB |
  pg_catalog | pg_depend       | table | postgres | 133 MB |
  pg_catalog | pg_class        | table | postgres | 69 MB  |
  pg_catalog | pg_attrdef      | table | postgres | 55 MB  |
  pg_catalog | pg_trigger      | table | postgres | 47 MB  |
  pg_catalog | pg_type         | table | postgres | 31 MB  |
  pg_catalog | pg_description  | table | postgres | 23 MB  |
  pg_catalog | pg_index        | table | postgres | 20 MB  |
  pg_catalog | pg_constraint   | table | postgres | 17 MB  |
  pg_catalog | pg_shdepend     | table | postgres | 17 MB  |
  pg_catalog | pg_statistic    | table | postgres | 928 kB |
  pg_catalog | pg_operator     | table | postgres | 552 kB |
  pg_catalog | pg_collation    | table | postgres | 232 kB |



There is a warning in the docs to be careful when using the -f (full) option
together with -j.

Regards,

Fabrízio



-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


[HACKERS] small postgresql.conf.sample bugfix

2015-01-29 Thread Pavel Stehule
Hello

max_locks_per_transaction appears twice in postgresql.conf.sample

Attached is a simple bugfix

Regards

Pavel
commit 89ee480408e63fc39fb6b7a23871c4f46438b7a2
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Thu Jan 29 12:21:07 2015 +0100

remove twice max_locks_per_transaction variable in postgresql.conf.sample

diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b053659..5f1a0da 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -562,8 +562,6 @@
 #--
 
 #deadlock_timeout = 1s
-#max_locks_per_transaction = 64		# min 10
-	# (change requires restart)
 # Note:  Each lock table slot uses ~270 bytes of shared memory, and there are
 # max_locks_per_transaction * (max_connections + max_prepared_transactions)
 # lock table slots.



Re: [HACKERS] small postgresql.conf.sample bugfix

2015-01-29 Thread Thom Brown
On 29 January 2015 at 11:35, Pavel Stehule pavel.steh...@gmail.com wrote:

 Hello

 max_locks_per_transaction appears twice in postgresql.conf.sample

 Attached is a simple bugfix


I only see one entry, which is the one you're removing.  Where's the other?

Thom


Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2015-01-29 Thread Fabrízio de Royes Mello
On Wed, Jan 28, 2015 at 5:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Fabrízio de Royes Mello fabriziome...@gmail.com writes:
  But I'm thinking about this patch and would not be interesting to have a
  FDW to manipulate the hba file? Imagine if we are able to manipulate the
  HBA file using INSERT/UPDATE/DELETE.

 Since the HBA file is fundamentally order-dependent, while SQL tables
 are fundamentally not, that doesn't seem like a great API match.  You
 could probably brute-force something that would work, but it would very
 much be a case of using a hammer to solve a screwdriver problem.


Maybe, but my intention is to provide an easy way to edit HBA entries. With an
extension or API to edit HBA entries, many developers of PostgreSQL tools
(i.e. pgAdmin, phpPgAdmin, etc.) would benefit.

Perhaps an FDW isn't the best choice; maybe a completely new SQL syntax to
manipulate HBA entries, like we did with ALTER SYSTEM, would work. These are just some
thoughts about it.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello


Re: [HACKERS] Safe memory allocation functions

2015-01-29 Thread Robert Haas
On Wed, Jan 28, 2015 at 9:34 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 As a result of all the comments on this thread, here are 3 patches
 implementing incrementally the different ideas from everybody:
 1) 0001 modifies aset.c to return unconditionally NULL in case of an
 OOM instead of reporting an error. All the OOM error reports are moved
 to mcxt.c (MemoryContextAlloc* and palloc*)

This seems like a good idea, but I think it's pretty clear that the
MemoryContextStats(TopMemoryContext) calls ought to move along with
the OOM error report.  The stats are basically another kind of
error-case output, and the whole point here is that the caller wants
to have control of what happens when malloc fails.  Committed that
way.
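
For context, the shape of that arrangement is roughly the following (a sketch
of the pattern, not the committed code): the context's alloc method returns
NULL on malloc failure, and the public wrapper in mcxt.c decides whether that
is an error, dumping the context stats as part of the report.

    void *
    MemoryContextAlloc(MemoryContext context, Size size)
    {
        void   *ret;

        ret = (*context->methods->alloc) (context, size);
        if (ret == NULL)
        {
            /* stats dump travels with the error report, per the above */
            MemoryContextStats(TopMemoryContext);
            ereport(ERROR,
                    (errcode(ERRCODE_OUT_OF_MEMORY),
                     errmsg("out of memory"),
                     errdetail("Failed on request of size %zu.", size)));
        }
        return ret;
    }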

 2) 0002 adds the noerror routines for frontend and backend.

We don't have consensus on this name; as I read it, Andres and I are
both strongly opposed to it.  Instead of continuing to litigate that
point, I'd like to propose that we just leave this out.  We are
unlikely to have so many callers who need the no-oom-error behavior to
justify adding a bunch of convenience functions --- and if that does
happen, we can resume arguing about the naming then.  For now, let's
just say that if you want that behavior, you should use
MemoryContextAllocExtended(CurrentMemoryContext, ...).

 3) 0003 adds MemoryContextAllocExtended that can be called with the
 following control flags:
 #define ALLOC_HUGE     0x01    /* huge allocation */
 #define ALLOC_ZERO     0x02    /* clear allocated memory */
 #define ALLOC_NO_OOM   0x04    /* no failure if out-of-memory */
 #define ALLOC_ALIGNED  0x08    /* request length suitable for MemSetLoop */
 This groups MemoryContextAlloc, MemoryContextAllocHuge,
 MemoryContextAllocZero and MemoryContextAllocZeroAligned under the
 same central routine.

I recommend we leave the existing MemoryContextAlloc* functions alone
and add a new MemoryContextAllocExtended() function *in addition*.  I
think the reason we have multiple copies of this code is because they
are sufficiently hot to make the effect of a few extra CPU
instructions potentially material.  By having separate copies of the
code, we avoid introducing extra branches.
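
As a usage sketch (flag names taken from the proposal quoted above, everything
else hypothetical), a caller wanting a zeroed allocation that degrades
gracefully instead of erroring out might do:

    static void *
    try_alloc_zeroed(Size len)
    {
        void   *buf;

        buf = MemoryContextAllocExtended(CurrentMemoryContext, len,
                                         ALLOC_ZERO | ALLOC_NO_OOM);
        if (buf == NULL)
            elog(LOG, "allocation of %zu bytes failed, falling back", len);
        return buf;             /* caller handles the NULL case */
    }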

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Noah Misch
On Wed, Jan 28, 2015 at 12:48:45PM -0500, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Tue, Jan 27, 2015 at 03:56:22PM -0500, Tom Lane wrote:
  So at this point I propose that we reject \u when de-escaping JSON.
 
  I would have agreed on 2014-12-09, and this release is the last chance to 
  make
  such a change.  It is a bold wager that could pay off, but -1 from me 
  anyway.
 
 You only get to vote -1 if you have a credible alternative.  I don't see
 one.

I don't love json enough to keep participating in a thread where you dismiss
patches and comments from Andrew and myself as quite useless, utter
hogwash and non-credible.

nm




Re: [HACKERS] small postgresql.conf.sample bugfix

2015-01-29 Thread Pavel Stehule
2015-01-29 12:52 GMT+01:00 Thom Brown t...@linux.com:

 On 29 January 2015 at 11:35, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 Hello

 a max_locks_per_transaction is twice times in postgresql.conf.sample

 Attached simple bugfix


 I only see one entry, which is the one you're removing.  Where's the other?


You are right. I am blind.

I am sorry for the noise.

I misread max_locks_per_transaction and max_pred_locks_per_transaction as the
same string.

Regards

Pavel


 Thom



Re: [HACKERS] Parallel Seq Scan

2015-01-29 Thread Amit Kapila
On Wed, Jan 28, 2015 at 8:59 PM, Amit Kapila amit.kapil...@gmail.com
wrote:

 I have tried the tests again and found that I have forgotten to increase
 max_worker_processes due to which the data is so different.  Basically
 at higher client count it is just scanning lesser number of blocks in
 fixed chunk approach.  So today I again tried with changing
 max_worker_processes and found that there is not much difference in
 performance at higher client count.  I will take some more data for
 both block_by_block and fixed_chunk approach and repost the data.


I have again taken the data and found that there is not much difference
between the block-by-block and fixed_chunk approaches; the data is at the
end of the mail for your reference.  There is variation in some cases: in the
fixed_chunk approach, the 8-worker case shows a lower time, however
on certain executions it has taken almost the same time as the other worker counts.

Now if we go with the block-by-block approach, we have the advantage that
the work distribution granularity will be smaller and hence better; if
we go with the chunk-by-chunk (fixed_chunk of 1GB) approach, then there
is a good chance that the kernel can better optimize the reads.

Based on inputs on this thread, one way for execution strategy could
be:

a. In optimizer, based on effective_io_concurrency, size of relation and
parallel_seqscan_degree, we can decide how many workers can be
used for executing the plan
- choose the number_of_workers equal to effective_io_concurrency,
  if it is less than parallel_seqscan_degree, else number_of_workers
  will be equal to parallel_seqscan_degree.
- if the size of relation is greater than number_of_workers times GB
  (if number_of_workers is 8, then we need to compare the size of
   relation with 8GB), then keep number_of_workers intact and distribute
  the remaining chunks/segments during execution, else
  reduce the number_of_workers such that each worker gets 1GB
  to operate.
- if the size of relation is less than 1GB, then we can either not
  choose the parallel_seqscan at all or could use smaller chunks
  or could use block-by-block approach to execute.
- here we need to consider other parameters like parallel_setup
  parallel_startup and tuple_communication cost as well.

b. In executor, if fewer workers are available than are required
for statement execution, then we can redistribute the remaining
work among the workers.
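
A rough sketch of the planner-side rule in (a), leaving out the cost terms;
all names here are hypothetical:

    static int
    choose_seqscan_workers(BlockNumber rel_pages,
                           int effective_io_concurrency,
                           int parallel_seqscan_degree)
    {
        uint64      rel_bytes = (uint64) rel_pages * BLCKSZ;
        uint64      chunk_bytes = (uint64) 1024 * 1024 * 1024;     /* 1GB segment */
        int         nworkers = Min(effective_io_concurrency,
                                   parallel_seqscan_degree);

        if (rel_bytes < chunk_bytes)
            return 0;           /* small relation: no parallel scan (or use
                                 * smaller chunks / block-by-block) */

        /* shrink the worker count until each worker has at least ~1GB to scan */
        if (rel_bytes < (uint64) nworkers * chunk_bytes)
            nworkers = (int) (rel_bytes / chunk_bytes);

        return nworkers;
    }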


Performance Data - Before the first run for each worker count, I executed
drop_caches to clear the cache and restarted the server, so we can
assume that all runs except Run-1 have some caching effect.


Fixed-Chunks

No. of workers/Time (ms)        0        8       16       32
Run-1                      322822   245759   330097   330002
Run-2                      275685   275428   301625   286251
Run-3                      252129   244167   303494   278604
Run-4                      252528   259273   250438   258636
Run-5                      250612   242072   235384   265918


Block-By-Block

No. of workers/Time (ms)        0        8       16       32
Run-1                      323084   341950   338999   334100
Run-2                      310968   349366   344272   322643
Run-3                      250312   336227   346276   322274
Run-4                      262744   314489   351652   325135
Run-5                      265987   316260   342924   319200


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Bruce Momjian
On Wed, Jan 28, 2015 at 03:13:47PM -0600, Jim Nasby wrote:
 While I sympathize with Noah's sentiments, the only thing that makes
 sense to me is that a JSON text field is treated the same way as we
 treat text. Right now, that means NUL is not allowed, period.

+1

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-29 Thread Robert Haas
On Wed, Jan 28, 2015 at 10:45 PM, Stephen Frost sfr...@snowman.net wrote:
 I agree, especially after going back and re-reading this while fixing
 the issue mentioned earlier by Peter (which was an orthogonal complaint
 about the shadowing of WITH CHECK by USING, if WITH CHECK isn't
 specified).  We really need a paragraph on USING policies and another
 on WITH CHECK policies.  How about a reword along these lines:

   When row level security is enabled on a table, all access to that
   table by users other than the owner or a superuser must be through a
   policy.  This requirement applies to both selecting out existing rows
   from the table and to adding rows to the table (through either INSERT
   or UPDATE).

   Granting access to existing rows in a table is done by specifying a
   USING expression which will be added to queries which reference the
   table.  Every row in the table which a USING expression returns true
   will be visible.

   Granting access to add rows to a table is done by specifying a WITH
    CHECK expression.  A WITH CHECK expression must return true for
   every row being added to the table or an error will be returned and
   the command will be aborted.

I hate to be a critic, but this existing sentence in the documentation
seems to explain nearly all of the above: "A policy limits the ability
to SELECT, INSERT, UPDATE, or DELETE rows in a table to those rows
which match the relevant policy expression. Existing table rows are
checked against the expression specified via USING, while new rows
that would be created via INSERT or UPDATE are checked against the
expression specified via WITH CHECK."  The only thing I can see we
might need to add is a sentence that says something like "If the USING
clause returns false for a particular row, that row will not be
visible to the user; if a WITH CHECK expression does not return true,
an error occurs."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-29 Thread Stephen Frost
* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
 On 29 January 2015 at 04:00, Stephen Frost sfr...@snowman.net wrote:
  * Robert Haas (robertmh...@gmail.com) wrote:
  On Wed, Jan 7, 2015 at 3:06 PM, Stephen Frost sfr...@snowman.net wrote:
   If I'm following correctly, Peter's specifically talking about:
  
    [ USING ( <replaceable class="parameter">expression</replaceable> ) ]
    [ WITH CHECK ( <replaceable 
    class="parameter">check_expression</replaceable> ) ]
  
   Where the USING parameter is 'expression' but the WITH CHECK parameter
   is 'check_expression'.  He makes a good point, I believe, as
   expression is overly generic.  I don't like the idea of using
   barrier_expression though as that ends up introducing a new term- how
   about using_expression?
 
  I've gone ahead and made this minor change.
 
 Don't forget the ALTER POLICY page. This and some of the other things
 being discussed on this thread ought to be copied there too.

Ah, yeah, good point, will address soon.

Thanks!

Stephen




Re: [HACKERS] pg_check_dir comments and implementation mismatch

2015-01-29 Thread Marco Nenciarini
On 29/01/15 18:37, Robert Haas wrote:
 On Thu, Jan 29, 2015 at 11:00 AM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 reading pgcheckdir.c code I noticed that the function comment
 was outdated. The code now can return values from 0 to 4 while the
 comment explains only values 0,1,2.
 
 This is not just a comment fix; you are clearly changing the behavior
 of the function in some way.
 

The previous version was returning 3 (mount point) even if the directory
contained something after the lost+found entry. I think this case
deserves an output of 4. For example, a directory containing:

lost+found
zzz.txt

gives the result 3.

If the directory contains

aaa.txt
lost+found

the result is instead 4.

This doesn't make much difference, as 3 and 4 are both error conditions
for all the callers, but the current behavior looks odd to me, and
is surely not what one would expect from reading the comments.

My version returns 3 only if the lost+found entry is alone in the
directory (still ignoring any dot files alongside it, as it did before).
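
In code, the rule amounts to roughly this (an illustrative sketch of the
classification loop, not the submitted patch):

#include <dirent.h>
#include <stdbool.h>
#include <string.h>

/* 1 = empty, 2 = only dot files, 3 = lost+found alone, 4 = not empty */
static int
classify_dir(DIR *chkdir)
{
    struct dirent *de;
    bool        dot_found = false;
    bool        mount_found = false;
    int         result = 1;

    while ((de = readdir(chkdir)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (de->d_name[0] == '.')
            dot_found = true;                   /* dot files are ignored */
        else if (strcmp(de->d_name, "lost+found") == 0)
            mount_found = true;                 /* candidate mount point */
        else
            return 4;                           /* real content: not empty */
    }

    if (mount_found)
        result = 3;             /* lost+found (plus dot files) alone */
    else if (dot_found)
        result = 2;
    return result;
}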

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it





Re: [HACKERS] pg_check_dir comments and implementation mismatch

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 11:00 AM, Marco Nenciarini
marco.nenciar...@2ndquadrant.it wrote:
 reading pgcheckdir.c code I noticed that the function comment
 was outdated. The code now can return values from 0 to 4 while the
 comment explains only values 0,1,2.

This is not just a comment fix; you are clearly changing the behavior
of the function in some way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_check_dir comments and implementation mismatch

2015-01-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Jan 29, 2015 at 11:00 AM, Marco Nenciarini
 marco.nenciar...@2ndquadrant.it wrote:
 reading pgcheckdir.c code I noticed that the function comment
 was outdated. The code now can return values from 0 to 4 while the
 comment explains only values 0,1,2.

 This is not just a comment fix; you are clearly changing the behavior
 of the function in some way.

I think he's trying to fix a bug in terms of slipshod definition of the
non-empty-directory subcases, but it would be nice to have some
clarity about that.

There is at least one other bug in that function now that I look at it:
in event of a readdir() failure, it neglects to execute closedir().
Perhaps not too significant since all existing callers will just exit()
anyway after a failure, but still ...

regards, tom lane




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Josh Berkus
On 01/29/2015 09:11 AM, Bruce Momjian wrote:
 On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote:
 Then step 2 should specify that it's for the master.
 
 Right.  Josh is just listing all the steps --- the pg_upgrade docs
 already have that spelled out in detail.

What I'm also saying is that, if we expect anyone to be able to follow
all of these steps, it has to be very explicit; just saying "Follow the
pg_upgrade docs but don't start the master yet" isn't clear enough,
because the pg_upgrade docs have a few alternative paths.

On  the whole, I personally would never follow this procedure at a
production site.  It's way too fragile and easy to screw up.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] File based Incremental backup v8

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 9:47 AM, Marco Nenciarini
marco.nenciar...@2ndquadrant.it wrote:
 The current implementation of copydir function is incompatible with LSN
 based incremental backups. The problem is that new files are created,
 but their blocks still have the old LSN, so they will not be backed
 up because they look old enough.

I think this is trying to pollute what's supposed to be a pure
fs-level operation (copy a directory) into something that is aware
of specific details like the PostgreSQL page format.  I really think
that nothing in storage/file should know about the page format.  If we
need a function that copies a file while replacing the LSNs, I think
it should be a new function living somewhere else.

A bigger problem is that you are proposing to stamp those files with
LSNs that are, for lack of a better word, fake.  I would expect that
this would completely break if checksums are enabled.  Also, unlogged
relations typically have an LSN of 0; this would change that in some
cases, and I don't know whether that's OK.

The issues here are similar to those in
http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de
- basically, I think we need to make CREATE DATABASE and ALTER
DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
never going to work right.  If we're not going to allow that, we need
to disallow hot backups while those operations are in progress.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GetLockConflicts() and thus recovery conflicts seem pretty broken

2015-01-29 Thread Simon Riggs
On 27 January 2015 at 14:27, Andres Freund and...@2ndquadrant.com wrote:

 While investigating other bugs I noticed that
 ResolveRecoveryConflictWithLock() wasn't really working. Turns out
 GetLockConflicts() violates its API contract, which says:

  * The result array is palloc'd and is terminated with an invalid VXID.

 Problem 1:
 We don't actually put the terminator there. It happens to work more or less
 accidentally on a master because the array is palloc0()ed there, and
 while 0 is a valid backend id, it happens not to be a valid local
 transaction id.

Yes, we should put the terminator there.

 In HS we don't actually allocate the array every time,
 but it's instead statically allocated. Without zeroing.

 Problem 2:
 Since bcd8528f001 and 29eedd312274, the "result array is palloc'd" part is
 wrong, because we're now doing:

 static VirtualTransactionId *vxids;
 /*
  * Allocate memory to store results, and fill with InvalidVXID.  We only
  * need enough space for MaxBackends + a terminator, since prepared xacts
  * don't count. InHotStandby allocate once in TopMemoryContext.
  */
 if (InHotStandby)
 {
     if (vxids == NULL)
         vxids = (VirtualTransactionId *)
             MemoryContextAlloc(TopMemoryContext,
                                sizeof(VirtualTransactionId) * (MaxBackends + 1));
 }
 else
     vxids = (VirtualTransactionId *)
         palloc0(sizeof(VirtualTransactionId) * (MaxBackends + 1));

 Obviously that violates the API contract. I'm inclined to rip the HS
 special case out and add a pfree() to the single HS caller.

Agreed. Removing special purpose code seems like a good idea.
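For illustration, a minimal sketch of the agreed-upon shape: always palloc
the array, always write the terminator, and let the single hot standby
caller (ResolveRecoveryConflictWithLock) pfree the result when done.  This
is only a sketch of the idea, not the committed fix:

VirtualTransactionId *
GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode)
{
    VirtualTransactionId *vxids;
    int         count = 0;

    /*
     * Room for MaxBackends conflicting virtual xids plus the invalid-VXID
     * terminator promised by the API contract; don't rely on palloc0() to
     * provide the terminator implicitly.
     */
    vxids = (VirtualTransactionId *)
        palloc(sizeof(VirtualTransactionId) * (MaxBackends + 1));

    /* ... scan the lock table, filling vxids[count++] for each conflict ... */

    /* Explicitly terminate the array, as the header comment requires. */
    SetInvalidVirtualTransactionId(vxids[count]);

    return vxids;
}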

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services




[HACKERS] File based Incremental backup v8

2015-01-29 Thread Marco Nenciarini
The current implementation of copydir function is incompatible with LSN
based incremental backups. The problem is that new files are created,
but their blocks still carry the old LSN, so they will not be backed
up because they look old enough to be skipped.

copydir function is used in:

  CREATE DATABASE
  ALTER DATABASE SET TABLESPACE

I can imagine two possible solutions:

a) WAL-log the whole copydir operation, setting the LSN accordingly
b) pass to copydir the LSN of the operation which triggered it, and
update the LSN of all the copied blocks

The latter solution is IMO easier to be implemented and does not deviate
much from the current implementation.

I've implemented it and it's attached to this message.
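To illustrate option (b), here is a rough sketch of the idea (an
illustration only, not the attached patch; the helper name is made up, and
it ignores, among other things, the checksum and unlogged-relation concerns
raised elsewhere in this thread):

/*
 * Hypothetical helper: copy one relation segment block by block, stamping
 * every page with the LSN of the operation that triggered the copy, so
 * that an LSN-based incremental backup no longer sees the new file as old.
 */
static void
copy_relfile_set_lsn(const char *srcpath, const char *dstpath, XLogRecPtr lsn)
{
    char        buffer[BLCKSZ];
    FILE       *src = AllocateFile(srcpath, "rb");
    FILE       *dst = AllocateFile(dstpath, "wb");

    if (src == NULL || dst == NULL)
        elog(ERROR, "could not open \"%s\" or \"%s\"", srcpath, dstpath);

    while (fread(buffer, 1, BLCKSZ, src) == BLCKSZ)
    {
        PageSetLSN((Page) buffer, lsn);     /* overwrite the stale page LSN */
        if (fwrite(buffer, 1, BLCKSZ, dst) != BLCKSZ)
            elog(ERROR, "could not write block to \"%s\"", dstpath);
    }

    FreeFile(src);
    FreeFile(dst);
}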

I've also moved the parse_filename_for_nontemp_relation function out of
reinit.c to make it available both to copydir.c and basebackup.c.

I've also limited the LSN comparison to the MAIN fork only, because:

* the FSM fork doesn't use LSNs
* the VM fork updates the LSN only when the visibility bit is set
* the INIT fork doesn't use LSNs; it's only one page anyway

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From 087faed899b9afab324aff7fa20e715c4f99eb4a Mon Sep 17 00:00:00 2001
From: Marco Nenciarini marco.nenciar...@2ndquadrant.it
Date: Thu, 29 Jan 2015 12:18:47 +0100
Subject: [PATCH 1/3] public parse_filename_for_nontemp_relation

---
 src/backend/storage/file/reinit.c | 58 ---
 src/common/relpath.c  | 56 +
 src/include/common/relpath.h  |  2 ++
 3 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/src/backend/storage/file/reinit.c 
b/src/backend/storage/file/reinit.c
index afd9255..02b5fee 100644
*** a/src/backend/storage/file/reinit.c
--- b/src/backend/storage/file/reinit.c
*** static void ResetUnloggedRelationsInTabl
*** 28,35 
  int 
op);
  static void ResetUnloggedRelationsInDbspaceDir(const char *dbspacedirname,
   int op);
- static bool parse_filename_for_nontemp_relation(const char *name,
-   int *oidchars, ForkNumber *fork);
  
  typedef struct
  {
--- 28,33 
*** ResetUnloggedRelationsInDbspaceDir(const
*** 388,446 
fsync_fname((char *) dbspacedirname, true);
}
  }
- 
- /*
-  * Basic parsing of putative relation filenames.
-  *
-  * This function returns true if the file appears to be in the correct format
-  * for a non-temporary relation and false otherwise.
-  *
-  * NB: If this function returns true, the caller is entitled to assume that
-  * *oidchars has been set to the a value no more than OIDCHARS, and thus
-  * that a buffer of OIDCHARS+1 characters is sufficient to hold the OID
-  * portion of the filename.  This is critical to protect against a possible
-  * buffer overrun.
-  */
- static bool
- parse_filename_for_nontemp_relation(const char *name, int *oidchars,
-   ForkNumber *fork)
- {
-   int pos;
- 
-   /* Look for a non-empty string of digits (that isn't too long). */
-   for (pos = 0; isdigit((unsigned char) name[pos]); ++pos)
-   ;
-   if (pos == 0 || pos > OIDCHARS)
-   return false;
-   *oidchars = pos;
- 
-   /* Check for a fork name. */
-   if (name[pos] != '_')
-   *fork = MAIN_FORKNUM;
-   else
-   {
-   int forkchar;
- 
-   forkchar = forkname_chars(name[pos + 1], fork);
-   if (forkchar <= 0)
-   return false;
-   pos += forkchar + 1;
-   }
- 
-   /* Check for a segment number. */
-   if (name[pos] == '.')
-   {
-   int segchar;
- 
-   for (segchar = 1; isdigit((unsigned char) name[pos + segchar]); ++segchar)
-   ;
-   if (segchar <= 1)
-   return false;
-   pos += segchar;
-   }
- 
-   /* Now we should be at the end. */
-   if (name[pos] != '\0')
-   return false;
-   return true;
- }
--- 386,388 
diff --git a/src/common/relpath.c b/src/common/relpath.c
index 66dfef1..83a1e3a 100644
*** a/src/common/relpath.c
--- b/src/common/relpath.c
*** GetRelationPath(Oid dbNode, Oid spcNode,
*** 206,208 
--- 206,264 
}
return path;
  }
+ 
+ /*
+  * Basic parsing of putative relation filenames.
+  *
+  * This function returns true if the file appears to be in the correct format
+  * for a non-temporary relation and false otherwise.
+  *
+  * NB: If this function returns true, the caller is 

[HACKERS] Memory leak in gingetbitmap

2015-01-29 Thread Heikki Linnakangas
While looking at the segfault that Olaf Gawenda reported (bug #12694), I 
realized that the GIN fast scan patch introduced a small memory leak when 
re-scanning a GIN index. In a nutshell, freeScanKeys() needs to pfree() 
the two new arrays, requiredEntries and additionalEntries.


After fixing that, I'm still seeing a small leak. I found that the 
queryCategories field also needs to be free'd, but it still leaks even 
after fixing that.


I think we need a more whole-sale approach. I'm thinking of adding a new 
memory context to contain everything related to the scan keys, which can 
then be destroyed in whole.
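A minimal sketch of that whole-sale approach (here "so" stands for the GIN
scan opaque state; the names are illustrative, not the final patch):

/* give all scan-key data its own context, created once per (re)scan */
so->keyCtx = AllocSetContextCreate(CurrentMemoryContext,
                                   "GIN scan key context",
                                   ALLOCSET_DEFAULT_MINSIZE,
                                   ALLOCSET_DEFAULT_INITSIZE,
                                   ALLOCSET_DEFAULT_MAXSIZE);

oldCtx = MemoryContextSwitchTo(so->keyCtx);
/* ... build the scan keys, requiredEntries, additionalEntries, ... */
MemoryContextSwitchTo(oldCtx);

/* then freeScanKeys() can simply do, with no per-array pfree() bookkeeping: */
MemoryContextResetAndDeleteChildren(so->keyCtx);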


We haven't heard any complaints about this from users, but I think this 
deserves to be fixed. Perhaps not worth back-patching however.


PS. Here's what I'm using to test this:

create extension btree_gin;
create table a (t text);
create table b (t text);
insert into b values ('foo'), ('bar');
insert into a select 'x'||g from generate_series(1, 200) g;
create index i_b On b using gin (t) ;
set enable_hashjoin=off;
set enable_material=off;
set enable_seqscan=off;
set enable_mergejoin=off;
set enable_indexscan=off;

select * from a, b where a.t = b.t;

It doesn't leak if the index is a regular b-tree index.

- Heikki




Re: [HACKERS] pg_dump with both --serializable-deferrable and -j

2015-01-29 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 I propose to apply the attached to master and back-patch to 9.3, 
 and follow that with a patch (for master only) along the lines 
 suggested by Andres.  Since *that* change is more invasive and 
 changes existing behavior I will submit it to the open CF for 
 review.  Objections?

Only the nit-picky one that I quite dislike putting a comment block inside
an if-condition like that.  It's not really house style around here,
and in particular I suspect pgindent might not treat it nicely.
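For instance, the same test could be written with the comment hoisted above
the condition, which pgindent copes with fine (just an illustrative
rearrangement of the snippet quoted elsewhere in this thread, not a
counter-proposal):

    /*
     * To support --serializable-deferrable together with --jobs, worker
     * connections that are handed a snapshot run in REPEATABLE READ; the
     * snapshot itself was acquired in a SERIALIZABLE, READ ONLY, DEFERRABLE
     * transaction, which preserves the integrity guarantees.
     */
    if (dopt->serializable_deferrable && AH->sync_snapshot_id == NULL)
        ExecuteSqlStatement(AH,
                            "SET TRANSACTION ISOLATION LEVEL "
                            "SERIALIZABLE, READ ONLY, DEFERRABLE");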

regards, tom lane




Re: [HACKERS] Hot Standby WAL reply uses heavyweight session locks, but doesn't have enough infrastructure set up

2015-01-29 Thread Robert Haas
On Wed, Jan 28, 2015 at 2:41 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Andres Freund wrote:
 I think this isn't particularly pretty, but it seems to be working well
 enough, and changing it would be pretty invasive. So keeping in line
 with all that code seems to be easier.
 OK, I'm convinced with this part to remove the call of
 LockSharedObjectForSession that uses dontWait and replace it by a loop
 in ResolveRecoveryConflictWithDatabase.

That seems right to me, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_dump with both --serializable-deferrable and -j

2015-01-29 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 A comment seems essential here, because as written anybody would 
 think the test for a snapshot is a bug.

 Good point.

I propose to apply the attached to master and back-patch to 9.3, 
and follow that with a patch (for master only) along the lines 
suggested by Andres.  Since *that* change is more invasive and 
changes existing behavior I will submit it to the open CF for 
review.  Objections?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index c3ebb3a..6d57fe0 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1011,7 +1011,17 @@ setup_connection(Archive *AH, DumpOptions *dopt, const char *dumpencoding,
 	ExecuteSqlStatement(AH, "BEGIN");
 	if (AH->remoteVersion >= 90100)
 	{
-		if (dopt->serializable_deferrable)
+		if (dopt->serializable_deferrable &&
+			/*
+			 * To support the combination of this option with the jobs option
+			 * we use REPEATABLE READ for the worker connections that are
+			 * passed a snapshot.  As long as the snapshot is acquired in a
+			 * SERIALIZABLE, READ ONLY, DEFERRABLE transaction, its use within
+			 * a REPEATABLE READ transaction provides the appropriate
+			 * integrity guarantees.  This is a kluge, but safe for
+			 * back-patching.
+			 */
+			AH->sync_snapshot_id == NULL)
 			ExecuteSqlStatement(AH,
 								"SET TRANSACTION ISOLATION LEVEL "
 								"SERIALIZABLE, READ ONLY, DEFERRABLE");



[HACKERS] pg_check_dir comments and implementation mismatch

2015-01-29 Thread Marco Nenciarini
Hi,

reading pgcheckdir.c code I noticed that the function comment
was outdated. The code now can return values from 0 to 4 while the
comment explains only values 0,1,2.
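For reference, a hypothetical caller illustrating the full set of return
codes that the updated comment documents (the messages are invented for
illustration):

    switch (pg_check_dir(dir))
    {
        case 0:
            printf("directory \"%s\" does not exist\n", dir);
            break;
        case 1:
            printf("directory \"%s\" exists and is empty\n", dir);
            break;
        case 2:
            printf("directory \"%s\" contains only dot files\n", dir);
            break;
        case 3:
            printf("directory \"%s\" contains a lost+found directory (mount point?)\n", dir);
            break;
        case 4:
            printf("directory \"%s\" exists and is not empty\n", dir);
            break;
        default:                /* -1: errno reflects the error */
            printf("could not access directory \"%s\": %s\n", dir, strerror(errno));
            break;
    }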

Patch attached.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
From 44623f67a124c4c77f7ff8097f13e116d20d83a5 Mon Sep 17 00:00:00 2001
From: Marco Nenciarini marco.nenciar...@2ndquadrant.it
Date: Thu, 29 Jan 2015 16:45:27 +0100
Subject: [PATCH] Update pg_check_dir comment

---
 src/port/pgcheckdir.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/port/pgcheckdir.c b/src/port/pgcheckdir.c
index 7061893..9165ebb 100644
*** a/src/port/pgcheckdir.c
--- b/src/port/pgcheckdir.c
***
*** 22,28 
   * Returns:
   *0 if nonexistent
   *1 if exists and empty
!  *2 if exists and not empty
   *-1 if trouble accessing directory (errno reflects the error)
   */
  int
--- 22,30 
   * Returns:
   *0 if nonexistent
   *1 if exists and empty
!  *2 if exists and contains _only_ dot files
!  *3 if exists and contains a mount point
!  *4 if exists and not empty
   *-1 if trouble accessing directory (errno reflects the error)
   */
  int
*** pg_check_dir(const char *dir)
*** 32,37 
--- 34,40 
DIR*chkdir;
struct dirent *file;
booldot_found = false;
+   boolmount_found = false;
  
chkdir = opendir(dir);
if (chkdir == NULL)
*** pg_check_dir(const char *dir)
*** 51,60 
{
dot_found = true;
}
else if (strcmp("lost+found", file->d_name) == 0)
{
!   result = 3; /* not empty, mount point */
!   break;
}
  #endif
else
--- 54,63 
{
dot_found = true;
}
+   /* lost+found directory */
else if (strcmp("lost+found", file->d_name) == 0)
{
!   mount_found = true;
}
  #endif
else
*** pg_check_dir(const char *dir)
*** 67,72 
--- 70,79 
if (errno || closedir(chkdir))
result = -1;/* some kind of I/O error? */
  
+   /* We report on mount point if we find a lost+found directory */
+   if (result == 1 && mount_found)
+   result = 3;
+ 
/* We report on dot-files if we _only_ find dot files */
if (result == 1 && dot_found)
result = 2;
-- 
2.2.2





Re: [HACKERS] TABLESAMPLE patch

2015-01-29 Thread Robert Haas
On Wed, Jan 28, 2015 at 5:19 AM, Petr Jelinek p...@2ndquadrant.com wrote:
 Yes, that's my view too. I would generally be for that change also and it
 would be worth it if the code was used in more than one place, but as it is
 it seems like it will just add code/complexity for no real benefit. It would
 make sense in case we used sequential scan node instead of the new node as
 Amit also suggested, but I remain unconvinced that mixing sampling and
 sequential scan into single scan node would be a good idea.

Based on previous experience, I expect that any proposal to merge
those nodes would get shot down by Tom with his laser-guided atomic
bazooka faster than you can say "-1 from me, regards tom lane".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2015-01-29 Thread Jim Nasby

On 1/29/15 6:19 AM, Fabrízio de Royes Mello wrote:

On Wed, Jan 28, 2015 at 5:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  Fabrízio de Royes Mello fabriziome...@gmail.com writes:
   But I'm thinking about this patch and would not be interesting to have a
   FDW to manipulate the hba file? Imagine if we are able to manipulate the
   HBA file using INSERT/UPDATE/DELETE.
 
  Since the HBA file is fundamentally order-dependent, while SQL tables
  are fundamentally not, that doesn't seem like a great API match.  You
  could probably brute-force something that would work, but it would very
  much be a case of using a hammer to solve a screwdriver problem.
 

Maybe, but my intention is to provide an easy way to edit HBA entries. With an 
extension or API to edit HBA entries, many developers of PostgreSQL tools (e.g. 
pgAdmin, phpPgAdmin) will benefit.

Perhaps an FDW isn't the best choice; maybe a completely new SQL syntax to 
manipulate HBA entries, like we did with ALTER SYSTEM. These are just some 
thoughts about it.


Aside from Tom's concern about sets not being a good way to handle this (which I agree 
with), the idea of editing pg_hba.conf via SQL raises all the problems that 
were brought up when ALTER SYSTEM was being developed. One of the big problems is a 
question of how you can safely modify a text file that's full of comments and what-not. 
You'd need to address those issues if you hope to modify pg_hba.conf via SQL.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 4:09 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 The difference between the autovacuum-run vacuum and the cron-run vacuum
 is that the one running out of cron will just keep holding the lock
 until it's actually able to truncate the end of the relation, no?  I
 recall discussion previously that we need a way to either support that
 in autovacuum for (a configurable set of) regular relations or come up
 with a solution that doesn't require that lock.

 AFAICT, in master, there is no difference in truncation between auto and
 manual vacuum. What we do is attempt to acquire the truncation lock for up
 to 5 seconds, giving up after that. Once we do have the lock, we check to
 see how many pages we can actually truncate. During that check, we test
 every ~20ms or so to see if someone else is waiting on our exclusive lock;
 if they are we stop counting and will only truncate the relation up to that
 point.

I don't think this is true, and I don't think it's been true for a
long time, if ever.  The difference between a manual vacuum and
autovacuum is that autovacuum commits suicide when it conflicts with
somebody else's lock request, and a manual vacuum doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Andrew Dunstan


On 01/29/2015 04:59 PM, Robert Haas wrote:

On Thu, Jan 29, 2015 at 4:33 PM, Andrew Dunstan and...@dunslane.net wrote:

On 01/29/2015 12:10 PM, Robert Haas wrote:

On Wed, Jan 28, 2015 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:

The cause of this bug is commit 0ad1a816320a2b539a51628e2a0b1e83ff096b1d,
which I'm inclined to think we need to simply revert, not render even
more squirrely.

Yes, that commit looks broken.  If you convert from text to JSON you
should get a JSON string equivalent to the text you started out with.
That commit departs from that in favor of allowing \u sequences in
the text being converted to turn into a single character (or perhaps
an encoding error) after a text->json->text roundtrip.  Maybe I
haven't had enough caffeine today, but I can't see how that can
possibly be a good idea.

Possibly.

I'm coming down more and more on the side of Tom's suggestion just to ban
\u0000 in jsonb. I think that would let us have some fairly simple and
consistent rules. I'm not too worried that we'll be disallowing input that
we've previously allowed. We've done that often in the past, although less
often in point releases. I certainly don't want to wait a full release cycle
to fix this if possible.

I have yet to understand what we fix by banning \u0000.  How is \u0000
different from any other four-digit hexadecimal number that's not a
valid character in the current encoding?  What does banning that one
particular value do?

In any case, whatever we do about that issue, the idea that the text
- json string transformation can *change the input string into some
other string* seems like an independent problem.



jsonb stores string values as postgres text values, with the unicode 
escapes resolved, just as it also resolves numbers and booleans into 
their native representation.  If you want the input perfectly preserved, 
use json, not jsonb.  I think that's made pretty clear in the docs.
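A small illustration of that difference on a UTF-8 database (an example made
up for this discussion, not output copied from anywhere):

select json  '"caf\u00e9"';   -- "caf\u00e9"  (input text preserved verbatim)
select jsonb '"caf\u00e9"';   -- "café"       (escape resolved on input)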


so text->jsonb->text is not and has never been expected to be a noop.

cheers

andrew




Re: [HACKERS] Small bug on CREATE EXTENSION pgq...

2015-01-29 Thread Jim Nasby

On 1/28/15 10:25 PM, Tom Lane wrote:

Stephen Frost sfr...@snowman.net writes:

* David Johnston (david.g.johns...@gmail.com) wrote:

Fair enough but reset to what?  I don't know the internal mechanics but
if the session default is warning and a local change sets it to notice
then an unconditional reset would not get us back to the intended value.



Yeah, we'd really want to reset it to what it was before.


An extension script runs as a single transaction, so SET LOCAL could've
been used to accomplish the result without trashing the session-lifespan
setting.

I'm not sure whether or not there was good reason to be changing the
setting at all, but it's entirely on the extension script's head that
it didn't do this in a less invasive way.


+1

One thing I have wished for is something akin to SET LOCAL that reverts at the 
end of a subtransaction. Typically I only need to tweak something like 
client_min_messages for a single command, so I'd prefer to

SAVEPOINT old_setting;
SET LOCAL blah;
command;
RELEASE old_setting;
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] [COMMITTERS] pgsql: Fix column-privilege leak in error-message paths

2015-01-29 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 Fix column-privilege leak in error-message paths

This patch is at least one brick shy of a load:

regression=# create table t1 (f1 int);
CREATE TABLE
regression=# create unique index on t1 (abs(f1));
CREATE INDEX
regression=# create user joe; 
CREATE ROLE
regression=# grant insert on t1 to joe;
GRANT
regression=# \c - joe
You are now connected to database regression as user joe.
regression= insert into t1 values (1);
INSERT 0 1
regression= insert into t1 values (1);
ERROR:  attribute 0 of relation with OID 45155 does not exist

The cause of that is the logic added to BuildIndexValueDescription, which
ignores the possibility that some of the index columns are expressions
(which will have a zero in indkey[]).

I'm not sure that it's worth trying to drill down and determine exactly
which column(s) are referenced by an expression.  I'd be content if we
just decided that any index expression is off-limits to someone without
full SELECT access, which could be achieved with something like

    {
        AttrNumber  attnum = idxrec->indkey.values[keyno];

-       aclresult = pg_attribute_aclcheck(indrelid, attnum, GetUserId(),
-                                         ACL_SELECT);
-
-       if (aclresult != ACLCHECK_OK)
+       if (attnum == InvalidAttrNumber ||
+           pg_attribute_aclcheck(indrelid, attnum, GetUserId(),
+                                 ACL_SELECT) != ACLCHECK_OK)
        {
            /* No access, so clean up and return */
            ReleaseSysCache(ht_idx);

(though a comment about it wouldn't be a bad thing either)

regards, tom lane




[HACKERS] Re: [COMMITTERS] pgsql: Fix column-privilege leak in error-message paths

2015-01-29 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  Fix column-privilege leak in error-message paths
 
 This patch is at least one brick shy of a load:
 
 regression=# create table t1 (f1 int);
 CREATE TABLE
 regression=# create unique index on t1 (abs(f1));
 CREATE INDEX
 regression=# create user joe; 
 CREATE ROLE
 regression=# grant insert on t1 to joe;
 GRANT
 regression=# \c - joe
 You are now connected to database regression as user joe.
 regression= insert into t1 values (1);
 INSERT 0 1
 regression= insert into t1 values (1);
 ERROR:  attribute 0 of relation with OID 45155 does not exist

Urgh.

 The cause of that is the logic added to BuildIndexValueDescription, which
 ignores the possibility that some of the index columns are expressions
 (which will have a zero in indkey[]).

Ah, yes, I didn't expect that, clearly.

 I'm not sure that it's worth trying to drill down and determine exactly
 which column(s) are referenced by an expression.  I'd be content if we
 just decided that any index expression is off-limits to someone without
 full SELECT access, which could be achieved with something like

Sounds perfectly reasonable to me.

 {
     AttrNumber  attnum = idxrec->indkey.values[keyno];
 
 -   aclresult = pg_attribute_aclcheck(indrelid, attnum, GetUserId(),
 -                                     ACL_SELECT);
 -
 -   if (aclresult != ACLCHECK_OK)
 +   if (attnum == InvalidAttrNumber ||
 +       pg_attribute_aclcheck(indrelid, attnum, GetUserId(),
 +                             ACL_SELECT) != ACLCHECK_OK)
 {
     /* No access, so clean up and return */
     ReleaseSysCache(ht_idx);

Yup, that looks good to me.

 (though a comment about it wouldn't be a bad thing either)

Agreed.

Will handle shortly.

Thanks!

Stephen




Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 4:33 PM, Andrew Dunstan and...@dunslane.net wrote:
 On 01/29/2015 12:10 PM, Robert Haas wrote:
 On Wed, Jan 28, 2015 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The cause of this bug is commit 0ad1a816320a2b539a51628e2a0b1e83ff096b1d,
 which I'm inclined to think we need to simply revert, not render even
 more squirrely.
 Yes, that commit looks broken.  If you convert from text to JSON you
 should get a JSON string equivalent to the text you started out with.
 That commit departs from that in favor of allowing \u sequences in
 the text being converted to turn into a single character (or perhaps
 an encoding error) after a text->json->text roundtrip.  Maybe I
 haven't had enough caffeine today, but I can't see how that can
 possibly be a good idea.

 Possibly.

 I'm coming down more and more on the side of Tom's suggestion just to ban
 \u0000 in jsonb. I think that would let us have some fairly simple and
 consistent rules. I'm not too worried that we'll be disallowing input that
 we've previously allowed. We've done that often in the past, although less
 often in point releases. I certainly don't want to wait a full release cycle
 to fix this if possible.

I have yet to understand what we fix by banning \u0000.  How is \u0000
different from any other four-digit hexadecimal number that's not a
valid character in the current encoding?  What does banning that one
particular value do?

In any case, whatever we do about that issue, the idea that the text
-> json string transformation can *change the input string into some
other string* seems like an independent problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Andrew Dunstan


On 01/29/2015 12:10 PM, Robert Haas wrote:

On Wed, Jan 28, 2015 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:

The cause of this bug is commit 0ad1a816320a2b539a51628e2a0b1e83ff096b1d,
which I'm inclined to think we need to simply revert, not render even
more squirrely.

Yes, that commit looks broken.  If you convert from text to JSON you
should get a JSON string equivalent to the text you started out with.
That commit departs from that in favor of allowing \u sequences in
the text being converted to turn into a single character (or perhaps
an encoding error) after a text->json->text roundtrip.  Maybe I
haven't had enough caffeine today, but I can't see how that can
possibly be a good idea.




Possibly.

I'm coming down more and more on the side of Tom's suggestion just to 
ban \u0000 in jsonb. I think that would let us have some fairly simple 
and consistent rules. I'm not too worried that we'll be disallowing 
input that we've previously allowed. We've done that often in the past, 
although less often in point releases. I certainly don't want to wait a 
full release cycle to fix this if possible.


cheers

andrew






Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Andres Freund
On 2015-01-29 16:33:36 -0500, Andrew Dunstan wrote:
 
 On 01/29/2015 12:10 PM, Robert Haas wrote:
 On Wed, Jan 28, 2015 at 12:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The cause of this bug is commit 0ad1a816320a2b539a51628e2a0b1e83ff096b1d,
 which I'm inclined to think we need to simply revert, not render even
 more squirrely.
 Yes, that commit looks broken.  If you convert from text to JSON you
 should get a JSON string equivalent to the text you started out with.
 That commit departs from that in favor of allowing \u sequences in
 the text being converted to turn into a single character (or perhaps
 an encoding error) after a text->json->text roundtrip.  Maybe I
 haven't had enough caffeine today, but I can't see how that can
 possibly be a good idea.

 I'm coming down more and more on the side of Tom's suggestion just to ban
 \u0000 in jsonb. I think that would let us have some fairly simple and
 consistent rules. I'm not too worried that we'll be disallowing input that
 we've previously allowed. We've done that often in the past, although less
 often in point releases. I certainly don't want to wait a full release cycle
 to fix this if possible.

I think waiting a full cycle would be likely to make things much worse,
given 9.4's current age.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg

2015-01-29 Thread Jim Nasby

On 1/28/15 4:25 PM, Tomas Vondra wrote:

+ * It's possible to choose whether to create a separate memory context for the
+ * array builde state, or whether to allocate it directly within rcontext


Typo.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Jim Nasby

On 1/28/15 7:45 PM, Stephen Frost wrote:

Jim,

* Jim Nasby (jim.na...@bluetreble.com) wrote:

On 12/23/14 12:52 PM, Stephen Frost wrote:

Autovacuum can certainly run vacuum/analyze on a few tables every 12
hours, so I'm not really following where you see autovacuum being unable
to cope.  I agree that there*are*  such cases, but getting more
information about those cases and exactly what solution*does*  work
would really help us improve autovacuum to address those use-cases.


(going through some old email...)

The two cases I've dealt with recently are:

- Tables with a fair update/delete rate that should always stay small

The problem with these tables is if anything happens to upset vacuuming you can 
end up with a significantly larger than expected table that's now essentially 
impossible to shrink. This could be caused by a single long-running transaction 
that happens to be in play when autovac kicks off, or for other reasons. Even 
once you manage to get all the tuples off the end of the heap it can still be 
extremely difficult to grab the lock you need to truncate it. Running a vacuum 
every minute from cron seems to help control this. Sadly, your indexes still 
get bloated, so occasionally you want to re-cluster too.


The difference between the autovacuum-run vacuum and the cron-run vacuum
is that the one running out of cron will just keep holding the lock
until it's actually able to truncate the end of the relation, no?  I
recall discussion previously that we need a way to either support that
in autovacuum for (a configurable set of) regular relations or come up
with a solution that doesn't require that lock.


AFAICT, in master, there is no difference in truncation between auto and manual 
vacuum. What we do is attempt to acquire the truncation lock for up to 5 
seconds, giving up after that. Once we do have the lock, we check to see how 
many pages we can actually truncate. During that check, we test every ~20ms or 
so to see if someone else is waiting on our exclusive lock; if they are we stop 
counting and will only truncate the relation up to that point.

So what this boils down to is that it's very hard to truncate a busy relation 
and your best bet of doing so is by repeatedly trying to.
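In rough pseudo-C, the behaviour being described looks like this (a
simplified sketch of the logic, not the actual lazy_truncate_heap() code;
elapsed_ms() and sleep_ms() are stand-ins):

    for (;;)
    {
        if (ConditionalLockRelation(onerel, AccessExclusiveLock))
            break;                      /* got the truncation lock */
        if (elapsed_ms() > 5000)
            return;                     /* give up truncating this time around */
        sleep_ms(50);                   /* and retry shortly */
    }

    /*
     * Walk backwards from the end of the heap counting removable pages;
     * roughly every 20ms, check whether someone else is waiting on our
     * exclusive lock and, if so, stop counting so we only truncate up to
     * the point already checked.
     */
    new_rel_pages = count_nondeletable_pages(onerel, vacrelstats);
    RelationTruncate(onerel, new_rel_pages);
    UnlockRelation(onerel, AccessExclusiveLock);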


- Preemptively vacuuming during off-hours

Many sites have either nightly or weekend periods of reduced load. Such sites 
can gain a great benefit from scheduling preemptive vacuums to reduce the odds 
of disruptive vacuuming activity during heavy activity periods. This is 
especially true when it comes to a scan_all vacuum of a large table; having 
autovac do one of those at a peak period can really hose things.


Having preferable times for autovacuum to run vacuums would certainly
be nice to support this use-case.

All that said, I'm not against a role attribute which allows the user to
vacuum/analyze anything.  I do think that's a bit different from the
existing effort to reduce the activities which require superuser as with
the vacuum/analyze case you *could* have a single role that's a member
of every role that owns the relations which you want to vacuum/analyze.
I grant that it's a bit awkward though.


Yeah, I was mostly just providing some use cases. I'm not opposed to a separate 
vacuum/analyze permission, but don't see a huge need for it either. Typically I 
set this stuff up as a cron on the server itself, utilizing an account that 
does ident authentication. I figure if someone manages to compromise that then 
they probably have root on the box anyway, which is obviously game over.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] File based Incremental backup v8

2015-01-29 Thread Andres Freund
On 2015-01-29 12:57:22 -0500, Robert Haas wrote:
 The issues here are similar to those in
 http://www.postgresql.org/message-id/20150120152819.gc24...@alap3.anarazel.de
 - basically, I think we need to make CREATE DATABASE and ALTER
 DATABASE .. SET TABLESPACE fully WAL-logged operations, or this is
 never going to work right.  If we're not going to allow that, we need
 to disallow hot backups while those operations are in progress.

Yea, the current way is just a hack from the dark ages. Which has some
advantages, true, but I don't think they outweigh the disadvantages. I
hope to find time to develop a patch to make those properly WAL logged
(for master) sometime not too far away.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Jan 29, 2015 at 4:33 PM, Andrew Dunstan and...@dunslane.net wrote:
 I'm coming down more and more on the side of Tom's suggestion just to ban
 \u0000 in jsonb.

 I have yet to understand what we fix by banning \u0000.  How is \u0000
 different from any other four-digit hexadecimal number that's not a
 valid character in the current encoding?  What does banning that one
 particular value do?

As Andrew pointed out upthread, it avoids having to answer the question of
what to return for

select (jsonb '["foo\u0000bar"]') ->> 0;

or any other construct which is supposed to return an *unescaped* text
representation of some JSON string value.

Right now you get

   ?column?    
---------------
 foo\u0000bar
(1 row)

Which is wrong IMO, first because it violates the premise that the output
should be unescaped, and second because this output cannot be
distinguished from the (correct) output of

regression=# select (jsonb '["foo\\u0000bar"]') ->> 0;
   ?column?    
---------------
 foo\u0000bar
(1 row)

There is no way to deliver an output that is not confusable with some
other value's correct output, other than by emitting a genuine \0 byte
which unfortunately we cannot support in a TEXT result.

Potential solutions for this have been mooted upthread, but none
of them look like they're something we can do in the very short run.
So the proposal is to ban \u0000 until such time as we can do something
sane with it.

 In any case, whatever we do about that issue, the idea that the text
 -> json string transformation can *change the input string into some
 other string* seems like an independent problem.

No, it's exactly the same problem, because the reason for that breakage
is an ill-advised attempt to make it safe to include \u0000 in JSONB.

regards, tom lane




Re: [HACKERS] Misaligned BufferDescriptors causing major performance problems on AMD

2015-01-29 Thread Andres Freund
Pushed a version (hopefully) fixing Tom's complaints.

On 2015-01-28 13:52:30 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2015-01-28 13:38:51 -0500, Tom Lane wrote:
  #define BUFFERDESC_PADDED_SIZE (SIZEOF_VOID_P == 8 ? 64 : 32)
 
  Hm, did you intentionally put a 32 in there or was that just the logical
  continuation of 64? Because there's no way it'll ever fit into 32 bytes
  in the near future. That's why I had put the sizeof(BufferDesc)
  there. We could just make it 1 as well, to indicate that we don't want
  any padding...
 
 Yeah, 1 would be fine too.  Maybe better to call it BUFFERDESC_MIN_SIZE,
 because as this stands it's enforcing a min size not exact size.

It's _PAD_TO_SIZE now.
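The union-based padding idiom under discussion looks roughly like this (a
sketch of the technique, not necessarily the exact committed code):

#define BUFFERDESC_PAD_TO_SIZE  (SIZEOF_VOID_P == 8 ? 64 : 1)

typedef union BufferDescPadded
{
    BufferDesc  bufferdesc;
    char        pad[BUFFERDESC_PAD_TO_SIZE];
} BufferDescPadded;

/* accesses then go through the union member, e.g.: */
#define GetBufferDescriptor(id) (&BufferDescriptors[(id)].bufferdesc)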

 not going to whinge about that aspect of it; the main point here is to
 put in the union and fix the ensuing notational fallout.  We can worry
 about exactly what size to pad to as a separate discussion.)

Yea. The details can be changed in a concise way now. I hope ;)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread David Steele
On 1/29/15 12:42 PM, Josh Berkus wrote:
 On 01/29/2015 09:11 AM, Bruce Momjian wrote:
 On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote:
 Then step 2 should specify that it's for the master.
 Right.  Josh is just listing all the steps --- the pg_upgrade docs
 already have that spelled out in detail.
 What I'm also saying is that, if we expect anyone to be able to follow
 all of these steps, it has to be very explicit; just saying "Follow the
 pg_upgrade docs but don't start the master yet" isn't clear enough,
 because the pg_upgrade docs have a few alternative paths.

 On the whole, I personally would never follow this procedure at a
 production site.  It's way too fragile and easy to screw up.

I'm in agreement with Josh - I would not use this method.  I may be
wrong, but it makes me extremely nervous.

I prefer to upgrade the primary and get it back up as soon as possible,
then take a backup and restore it to the replicas.  If the replicas are
being used for read-only queries instead of just redundancy then I
redirect that traffic to the primary while the replicas are being
upgraded and restored.  This method has the least downtime for the primary.

If you want less downtime overall then it's best to use the hot rsync /
cold rsync with checksums method, though this depends a lot on the size
of your database.

Ultimately, there is no single best method.  It depends a lot on your
environment.  I would prefer the official documents to contain very safe
methods.

-- 
- David Steele
da...@pgmasters.net






Re: [HACKERS] Exposing the stats snapshot timestamp to SQL

2015-01-29 Thread Matt Kelly
Robert, I'll add it to the commitfest.

Jim, I'm not sure I understand what you mean?  This new function follows
the same conventions as everything else in the file.  TimestampTz is just a
typedef for int64.  Functions like pg_stat_get_buf_alloc follow the exact
same pattern on the int64 fields of the global stats struct.
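For context, the intended use is simply along these lines (an illustrative
query, assuming the function from the patch):

-- how stale are the statistics this backend is currently looking at?
SELECT now() - pg_stat_snapshot_timestamp() AS snapshot_age;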

- Matt K.

On Thu, Jan 29, 2015 at 6:49 PM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 1/28/15 11:18 PM, Matt Kelly wrote:

 In a previous thread Tom Lane said:

 (I'm also wondering if it'd make sense to expose the stats timestamp
 as a callable function, so that the case could be dealt with
 programmatically as well.  But that's future-feature territory.)

 (http://www.postgresql.org/message-id/27251.1421684...@sss.pgh.pa.us)

 It seemed the appropriate scope for my first submission, and that feature
 has been on my wish list for a while, so I thought I'd grab it.


 I've reviewed the patch (though haven't tested it myself) and it looks
 good. The only thing I'm not sure of is this:

 + /* Get the timestamp of the current statistics snapshot */
 + Datum
 + pg_stat_snapshot_timestamp(PG_FUNCTION_ARGS)
 + {
 +   PG_RETURN_TIMESTAMPTZ(pgstat_fetch_global()->stats_timestamp);
 + }

 Is the community OK with referencing stats_timestamp that way?
 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Jim Nasby

On 1/29/15 7:02 PM, David Steele wrote:

On 1/29/15 7:55 PM, Jim Nasby wrote:

On 1/29/15 6:25 PM, David Steele wrote:

Safe backups can be done without LSNs provided you are willing to trust
your timestamps.


Which AFAICT simply isn't safe to do at all... except maybe with the
manifest stuff you've talked about?


Yes - that's what I'm talking about.  I had hoped to speak about this at
PgConfNYC, but perhaps I can do it in a lightning talk instead.


Sounds like maybe it should be part of our documentation too...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Jim Nasby

On 1/29/15 5:53 PM, David Steele wrote:

On 1/29/15 12:42 PM, Josh Berkus wrote:

On 01/29/2015 09:11 AM, Bruce Momjian wrote:

On Thu, Jan 29, 2015 at 12:09:58PM -0500, Andrew Dunstan wrote:

Then step 2 should specify that it's for the master.

Right.  Josh is just listing all the steps --- the pg_upgrade docs
already have that spelled out in detail.

What I'm also saying is that, if we expect anyone to be able to follow
all of these steps, it has to be very explicit; just saying "Follow the
pg_upgrade docs but don't start the master yet" isn't clear enough,
because the pg_upgrade docs have a few alternative paths.

On the whole, I personally would never follow this procedure at a
production site.  It's way too fragile and easy to screw up.


I'm in agreement with Josh - I would not use this method.  I may be
wrong, but it makes me extremely nervous.

I prefer to upgrade the primary and get it back up as soon as possible,
then take a backup and restore it to the replicas.  If the replicas are
being used for read-only queries instead of just redundancy then I
redirect that traffic to the primary while the replicas are being
upgraded and restored.  This method has the least downtime for the primary.

If you want less downtime overall then it's best to use the hot rsync /
cold rsync with checksums method, though this depends a lot on the size
of your database.

Ultimately, there is no single best method.  It depends a lot on your
environment.  I would prefer the official documents to contain very safe
methods.


How do we define safe though? Your method leaves you without a backup server 
until your base backup completes and the replica catches up. I think we do a 
dis-service to our users by not pointing that out and providing a potential 
alternate *so long as we spell out the tradeoffs/risks*.

Ultimately, I think this thread really shows the very large need for a tool 
that understands things like LSNs to provide rsync-ish behavior that's actually 
safe.

FWIW, I personally am very leery of relying on pg_upgrade. It's too easy to 
introduce bugs, doesn't handle all cases, and provides no option for going back to 
your previous version without losing data. I much prefer old_version -- londiste 
-- new_version, and then doing the upgrade by reversing the direction of 
replication.

I also don't entirely trust PITR backups. It's too easy to accidentally break 
them in subtle ways.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Exposing the stats snapshot timestamp to SQL

2015-01-29 Thread Matt Kelly
This is now: https://commitfest.postgresql.org/4/128/

On Thu, Jan 29, 2015 at 7:01 PM, Matt Kelly mkell...@gmail.com wrote:

 Robert, I'll add it to the commitfest.

 Jim, I'm not sure I understand what you mean?  This new function follows
 the same conventions as everything else in the file.  TimestampTz is just a
 typedef for int64.  Functions like pg_stat_get_buf_alloc follow the exact
 same pattern on the int64 fields of the global stats struct.

 - Matt K.

 On Thu, Jan 29, 2015 at 6:49 PM, Jim Nasby jim.na...@bluetreble.com
 wrote:

 On 1/28/15 11:18 PM, Matt Kelly wrote:

 In a previous thread Tom Lane said:

 (I'm also wondering if it'd make sense to expose the stats timestamp
 as a callable function, so that the case could be dealt with
 programmatically as well.  But that's future-feature territory.)

 (http://www.postgresql.org/message-id/27251.1421684...@sss.pgh.pa.us)

 It seemed the appropriate scope for my first submission, and that
 feature has been on my wish list for a while, so I thought I'd grab it.


 I've reviewed the patch (though haven't tested it myself) and it looks
 good. The only thing I'm not sure of is this:

 + /* Get the timestamp of the current statistics snapshot */
 + Datum
 + pg_stat_snapshot_timestamp(PG_FUNCTION_ARGS)
 + {
  +   PG_RETURN_TIMESTAMPTZ(pgstat_fetch_global()->stats_timestamp);
 + }

 Is the community OK with referencing stats_timestamp that way?
 --
 Jim Nasby, Data Architect, Blue Treble Consulting
 Data in Trouble? Get it in Treble! http://BlueTreble.com





Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread Jim Nasby

On 1/29/15 6:25 PM, David Steele wrote:

Safe backups can be done without LSNs provided you are willing to trust
your timestamps.


Which AFAICT simply isn't safe to do at all... except maybe with the manifest 
stuff you've talked about?


FWIW, I personally am very leery of relying on pg_upgrade. It's too
easy to introduce bugs, doesn't handle all cases, and provides no
option for going back to your previous version without losing data. I
much prefer old_version -> londiste -> new_version, and then doing
the upgrade by reversing the direction of replication.

I think the official docs need to stick with options that are core?


I don't think we have any such requirement. IIRC the docs used to talk about 
using logical replication before we had pg_upgrade (and may have actually 
called out Slony).


I avoid pg_upgrade wherever it is practical.  However, sometimes it
really is the best option.


Certainly. I think what we should be doing is spelling out the available 
options (with pros/cons) so that users can decide what's best.


I also don't entirely trust PITR backups. It's too easy to
accidentally break them in subtle ways.

Agreed in general, but I've been doing a lot of work to make this not be
true anymore.


:)

I'd love to see all this stuff Just Work (tm), but I don't think we're there 
yet, and I'm not really sure how we can get there.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I have yet to understand what we fix by banning \u0000.  How is \u0000
 different from any other four-digit hexadecimal number that's not a
 valid character in the current encoding?  What does banning that one
 particular value do?

BTW, as to the point about encoding violations: we *already* ban \u
sequences that don't correspond to valid characters in the current
encoding.  The attempt to exclude U+0000 from the set of banned characters
was ill-advised, plain and simple.

regards, tom lane




Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread David Steele
On 1/29/15 11:34 AM, Bruce Momjian wrote:

 3. Check that the replica is not very lagged.  If it is, wait for
 traffic to die down and for it to catch up.

I think I'd want something a bit more specific here.  When the primary
shuts down it will kick out one last WAL.  The filename should be recorded.

 7. shut down postgres on the replica.

Before the shutdown make sure that the replicas are waiting on the
subsequent log file to appear (note that versions prior to 9.3 skip
00).  That means all WAL has been consumed and the primary and
replica(s) are in the same state.
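One way to sanity-check that is to compare WAL positions (illustrative
queries using the 9.x function names):

-- on the primary, immediately before shutting it down:
SELECT pg_current_xlog_location();

-- on each replica, once the primary has stopped:
SELECT pg_last_xlog_replay_location();
-- the replay location should have reached (or passed) the position
-- reported on the primary once all WAL has been applied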

This is a bit more complex if streaming replication is being used
*without* good old fashioned log shipping to a backup server and I'm not
sure exactly how to go about it.  I suppose you could start Postgres in
single user mode, commit a transaction, and make sure that transaction
gets to the replicas.

OTOH, streaming replication (unless it is synchronous) would be crazy
without doing WAL backup.  Maybe that's just me.

-- 
- David Steele
da...@pgmasters.net





Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread David Steele
On 1/29/15 7:55 PM, Jim Nasby wrote:
 On 1/29/15 6:25 PM, David Steele wrote:
 Safe backups can be done without LSNs provided you are willing to trust
 your timestamps.

 Which AFAICT simply isn't safe to do at all... except maybe with the
 manifest stuff you've talked about?

Yes - that's what I'm talking about.  I had hoped to speak about this at
PgConfNYC, but perhaps I can do it in a lightning talk instead.

-- 
- David Steele
da...@pgmasters.net






Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Jim Nasby

On 1/29/15 4:02 PM, Robert Haas wrote:

On Thu, Jan 29, 2015 at 4:09 PM, Jim Nasby jim.na...@bluetreble.com wrote:

The difference between the autovacuum-run vacuum and the cron-run vacuum
is that the one running out of cron will just keep holding the lock
until it's actually able to truncate the end of the relation, no?  I
recall discussion previously that we need a way to either support that
in autovacuum for (a configurable set of) regular relations or come up
with a solution that doesn't require that lock.


AFAICT, in master, there is no difference in truncation between auto and
manual vacuum. What we do is attempt to acquire the truncation lock for up
to 5 seconds, giving up after that. Once we do have the lock, we check to
see how many pages we can actually truncate. During that check, we test
every ~20ms or so to see if someone else is waiting on our exclusive lock;
if they are we stop counting and will only truncate the relation up to that
point.


I don't think this is true, and I don't think it's been true for a
long time, if ever.  The difference between a manual vacuum and
autovacuum is that autovacuum commits suicide when it conflicts with
somebody else's lock request, and a manual vacuum doesn't.


Any idea where we set that up? The call stack is (note I'm ignoring vacuum full) 
autovacuum_do_vac_analyze() -> vacuum() -> vacuum_rel() -> lazy_vacuum_rel() -> 
lazy_truncate_heap() (which also calls count_nondeletable_pages()), and I don't see any 
IsAutoVacuumWorkerProcess() calls in lazy_truncate_heap() or count_nondeletable_pages(). So 
AFAICT truncation operates the same regardless of how the vacuum was started. We also 
explicitly set things like statement_timeout to 0 in autovac, so I don't think that would 
be responsible for this...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Proposal: two new role attributes and/or capabilities?

2015-01-29 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 On 1/29/15 4:02 PM, Robert Haas wrote:
 I don't think this is true, and I don't think it's been true for a
 long time, if ever.  The difference between a manual vacuum and
 autovacuum is that autovacuum commits suicide when it conflicts with
 somebody else's lock request, and a manual vacuum doesn't.

 Any idea where we set that up?

The relevant logic is in the deadlock detector, which will cancel an
autovacuum transaction if it is blocking somebody else's lock request.

regards, tom lane




Re: [HACKERS] Exposing the stats snapshot timestamp to SQL

2015-01-29 Thread Tom Lane
Matt Kelly mkell...@gmail.com writes:
 Jim, I'm not sure I understand what you mean?  This new function follows
 the same conventions as everything else in the file.  TimestampTz is just a
 typedef for int64.

... or double.  Have you checked that the code behaves properly with
--disable-integer-datetimes?

regards, tom lane




Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-29 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Jan 28, 2015 at 10:45 PM, Stephen Frost sfr...@snowman.net wrote:
  I agree, especially after going back and re-reading this while fixing
  the issue mentioned earlier by Peter (which was an orthogonal complaint
  about the shadowing of WITH CHECK by USING, if WITH CHECK isn't
  specified).  We really need a paragraph on USING policies and another
  on WITH CHECK policies.  How about a reword along these lines:
 
When row level security is enabled on a table, all access to that
table by users other than the owner or a superuser must be through a
policy.  This requirement applies to both selecting out existing rows
from the table and to adding rows to the table (through either INSERT
or UPDATE).
 
Granting access to existing rows in a table is done by specifying a
USING expression which will be added to queries which reference the
   table.  Every row in the table for which a USING expression returns true
will be visible.
 
Granting access to add rows to a table is done by specifying a WITH
   CHECK expression.  A WITH CHECK expression must return true for
every row being added to the table or an error will be returned and
the command will be aborted.
 
 I hate to be a critic, but this existing sentence in the documentation
 seems to explain nearly all of the above: "A policy limits the ability
 to SELECT, INSERT, UPDATE, or DELETE rows in a table to those rows
 which match the relevant policy expression. Existing table rows are
 checked against the expression specified via USING, while new rows
 that would be created via INSERT or UPDATE are checked against the
 expression specified via WITH CHECK."  The only thing I can see we
 might need to add is a sentence that says something like "If the USING
 clause returns false for a particular row, that row will not be
 visible to the user; if a WITH CHECK expression does not return true,
 an error occurs."

Well, I agree (I suppose perhaps that I have to, since I'm pretty sure I
wrote that.. :D), but what Dean was suggesting was a reword that
approached it from the other direction- that is, talk about policies as
granting access, instead of limiting it.  I thought that was an
excellent approach which will hopefully reduce confusion.  To that end,
how about this reword:

-
A policy grants the ability to SELECT, INSERT, UPDATE, or DELETE rows
which match the relevant policy expression. Existing table rows are
checked against the expression specified via USING, while new rows
that would be created via INSERT or UPDATE are checked against the
expression specified via WITH CHECK.  When a USING expression returns
false for a given row, that row is not visible to the user.  When a WITH
CHECK expression returns false for a row which is to be added, an error
occurs.
-

This would need to be after we talk about row level security being
enabled for a table and the default-deny policy or it might not make
sense, but otherwise I'm thinking it works.

Thoughts?

Thanks!

Stephen
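
For readers following along, a minimal sketch of how USING and WITH CHECK
interact under the proposed wording (table, policy, role, and expressions are
all hypothetical):

CREATE TABLE accounts (owner text, balance numeric);
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- USING controls which existing rows are visible; WITH CHECK controls which
-- new or modified rows are accepted.
CREATE POLICY account_owner ON accounts
    USING (owner = current_user)
    WITH CHECK (owner = current_user);

-- Connected as a role other than the table owner:
SELECT * FROM accounts;                           -- rows owned by others are simply not visible
INSERT INTO accounts VALUES ('someone_else', 0);  -- fails the WITH CHECK expression, raises an error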


signature.asc
Description: Digital signature


Re: [HACKERS] Error check always bypassed in tablefunc.c

2015-01-29 Thread Michael Paquier
On Fri, Jan 30, 2015 at 10:32 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 [blah]
 What I did about this was to leave the behavior alone in back branches,
 but insist on a type match in HEAD.  I think we can reasonably tighten
 the type requirements in a new major release, but doing it in minor
 releases is probably a bad idea.
Hm. OK. I am fine with that for the back branches. Thanks for
tightening things on master as well.

 * I thought it was odd to throw an error for NULL input, especially
 an infinite recursion error.  However, even with your patch the code
 would've dumped core on a null current_key value (since it would've
 passed a null start_with down to the next recursion level).  What I
 changed it to was to omit the recursion test (and hence print the row)
 and then not recurse at a null.  This seems consistent and reasonable.
This sounds reasonable to me as well.

 * I made a few other minor cleanups as well, in particular getting
 rid of some unnecessary pstrdup() steps.
Thanks!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Exposing the stats snapshot timestamp to SQL

2015-01-29 Thread Jim Nasby

On 1/28/15 11:18 PM, Matt Kelly wrote:

In a previous thread Tom Lane said:

(I'm also wondering if it'd make sense to expose the stats timestamp
as a callable function, so that the case could be dealt with
programmatically as well.  But that's future-feature territory.)

(http://www.postgresql.org/message-id/27251.1421684...@sss.pgh.pa.us)

It seemed the appropriate scope for my first submission, and that feature has 
been on my wish list for a while, so I thought I'd grab it.


I've reviewed the patch (though haven't tested it myself) and it looks good. 
The only thing I'm not sure of is this:

+ /* Get the timestamp of the current statistics snapshot */
+ Datum
+ pg_stat_snapshot_timestamp(PG_FUNCTION_ARGS)
+ {
+	PG_RETURN_TIMESTAMPTZ(pgstat_fetch_global()->stats_timestamp);
+ }

Is the community OK with referencing stats_timestamp that way?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
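
As a usage sketch, assuming the patch is applied as posted (the value shown is
only illustrative):

-- The statistics snapshot is taken once per transaction and then cached;
-- the new function reports when that snapshot was produced.
SELECT pg_stat_snapshot_timestamp();
--    pg_stat_snapshot_timestamp
-- --------------------------------
--  2015-01-29 23:18:42.123456-05
-- (1 row)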


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread David Steele
On 1/29/15 10:13 AM, Bruce Momjian wrote:
 Agreed.  I have updated the two mentions of rsync in our docs to clarify
 this.  Thank you.

 The patch also has pg_upgrade doc improvements suggested by comments
 from Josh Berkus.

It's very good to see this.  Mentions of this rsync vulnerability are
few and far between.

-- 
- David Steele
da...@pgmasters.net




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Parallel Seq Scan

2015-01-29 Thread Jim Nasby

On 1/28/15 7:27 PM, Stephen Frost wrote:

* Jim Nasby (jim.na...@bluetreble.com) wrote:

On 1/28/15 9:56 AM, Stephen Frost wrote:

 Such i/o systems do exist, but a single RAID5 group over spinning rust
 with a simple filter isn't going to cut it with a modern CPU- we're just
 too darn efficient to end up i/o bound in that case.  A more complex
 filter might be able to change it over to being more CPU bound than i/o
 bound and produce the performance improvements you're looking for.


Except we're nowhere near being IO efficient. The vast difference between 
Postgres IO rates and dd shows this. I suspect that's because we're not giving the 
OS a list of IO to perform while we're doing our thing, but that's just a guess.

Uh, huh?  The dd was ~321000 and the slowest uncached PG run from
Robert's latest tests was 337312.554, based on my inbox history at
least.  I don't consider ~4-5% difference to be vast.


Sorry, I was speaking more generally than this specific test. In the past I've 
definitely seen SeqScan performance that was an order of magnitude slower than 
what dd would do. This was an older version of Postgres and an older version of 
linux, running on an iSCSI SAN. My suspicion is that the added IO latency 
imposed by iSCSI is what was causing this, but that's just conjecture.

I think Robert was saying that he hasn't been able to see this effect on their 
test server... that makes me think it's doing read-ahead on the OS level. But I 
suspect it's pretty touch and go to rely on that; I'd prefer we have some way 
to explicitly get that behavior where we want it.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
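
On the explicit read-ahead point: one OS-level facility that can express such
a hint is posix_fadvise(); a minimal sketch follows (an illustration only, not
something proposed in the thread, and only advisory to the kernel).  If memory
serves, the existing effective_io_concurrency prefetching for bitmap heap
scans is built on the same call.

#include <fcntl.h>

/*
 * Ask the kernel for aggressive read-ahead on a file range we expect to scan
 * sequentially.  POSIX_FADV_SEQUENTIAL widens the read-ahead window and
 * POSIX_FADV_WILLNEED starts fetching the pages now; both are hints and are
 * not available on every platform, hence the guard.
 */
static void
advise_sequential_range(int fd, off_t offset, off_t length)
{
#if defined(POSIX_FADV_SEQUENTIAL) && defined(POSIX_FADV_WILLNEED)
	(void) posix_fadvise(fd, offset, length, POSIX_FADV_SEQUENTIAL);
	(void) posix_fadvise(fd, offset, length, POSIX_FADV_WILLNEED);
#endif
}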


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread David Steele
On 1/29/15 7:07 PM, Jim Nasby wrote:
 Ultimately, there is no single best method.  It depends a lot on your
 environment.  I would prefer the official documents to contain very safe
 methods.

 How do we define safe though? Your method leaves you without a backup
 server until your base backup completes and the replica catches up. I
 think we do a disservice to our users by not pointing that out and
 providing a potential alternate *so long as we spell out the
 tradeoffs/risks*.

My method leaves you without a replica, but not without a *backup* as
long as you are shipping WAL somewhere safe.  You can set
archive_timeout to something small if you want to make this safer.  This
is more practical in 9.4 since unused WAL space is zeroed.
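
For example, on 9.4 that can be done without a restart (the value is only
illustrative; pick it based on how much unarchived work you are willing to
risk):

ALTER SYSTEM SET archive_timeout = '60s';
SELECT pg_reload_conf();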

OK, I'm willing to admit it would be better to have the option with all
caveats, so long as they are strongly worded.

 Ultimately, I think this thread really shows the very large need for a
 tool that understands things like LSNs to provide rsync-ish behavior
 that's actually safe.

Safe backups can be done without LSNs provided you are willing to trust
your timestamps.

 FWIW, I personally am very leery of relying on pg_upgrade. It's too
 easy to introduce bugs, doesn't handle all cases, and provides no
 option for going back to your previous version without losing data. I
 much prefer old_version --> londiste --> new_version, and then doing
 the upgrade by reversing the direction of replication.

I think the official docs need to stick with options that are core?

I avoid pg_upgrade wherever it is practical.  However, sometimes it
really is the best option.

 I also don't entirely trust PITR backups. It's too easy to
 accidentally break them in subtle ways.

Agreed in general, but I've been doing a lot of work to make this not be
true anymore.

-- 
- David Steele
da...@pgmasters.net




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-29 Thread Robert Haas
On Thu, Jan 29, 2015 at 9:04 PM, Stephen Frost sfr...@snowman.net wrote:
 A policy grants the ability to SELECT, INSERT, UPDATE, or DELETE rows
 which match the relevant policy expression. Existing table rows are
 checked against the expression specified via USING, while new rows
 that would be created via INSERT or UPDATE are checked against the
 expression specified via WITH CHECK.  When a USING expression returns
 false for a given row, that row is not visible to the user.  When a WITH
 CHECK expression returns false for a row which is to be added, an error
 occurs.

Yeah, that's not bad.  I think it's an improvement, in fact.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_check_dir comments and implementation mismatch

2015-01-29 Thread Michael Paquier
On Fri, Jan 30, 2015 at 2:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 There is at least one other bug in that function now that I look at it:
 in event of a readdir() failure, it neglects to execute closedir().
 Perhaps not too significant since all existing callers will just exit()
 anyway after a failure, but still ...
I would imagine that code scanners like coverity or similar would not
be happy about that. ISTM that it is better to closedir()
appropriately in all the code paths.
-- 
Michael
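
A sketch of the pattern being advocated, i.e. closing the handle on the
readdir() failure path as well (illustrative only, not the actual
pg_check_dir() code):

#include <dirent.h>
#include <errno.h>

static int
count_dir_entries(const char *path)
{
	DIR		   *dir = opendir(path);
	int			count = 0;

	if (dir == NULL)
		return -1;

	errno = 0;
	while (readdir(dir) != NULL)
	{
		count++;
		errno = 0;
	}
	if (errno != 0)
		count = -1;				/* readdir() failed */

	/*
	 * Reached on every path, including the failure case; real code would
	 * also preserve errno across closedir() for the caller's benefit.
	 */
	closedir(dir);
	return count;
}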


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Tom Lane
Attached is a draft patch for this.  It basically reverts commit
0ad1a816320a2b539a51628e2a0b1e83ff096b1d, adds a ban of \u0000 if
that would need to be converted to text (so it still works in the
plain json type, so long as you don't do much processing), and adds
some regression tests.
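
The user-visible effect of the ban would look roughly like this (the exact
error code and wording are precisely what the next paragraph is about):

-- jsonb resolves Unicode escapes on input, and a NUL byte cannot be stored
-- in text, so \u0000 gets rejected:
SELECT '"\u0000"'::jsonb;   -- error under the draft patch

-- plain json keeps the escape verbatim, so it continues to work:
SELECT '"\u0000"'::json;
--    json
-- ----------
--  "\u0000"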

I made the \u0000 error be errcode(ERRCODE_INVALID_TEXT_REPRESENTATION)
and errmsg("invalid input syntax for type json"), by analogy to what's
thrown for non-ASCII Unicode escapes in non-UTF8 encoding.  I'm not
terribly happy with that, though.  ISTM that for both cases, this is
not invalid syntax at all, but an implementation restriction that
forces us to reject perfectly valid syntax.  So I think we ought to
use a different ERRCODE and text message, though I'm not entirely
sure what it should be instead.  ERRCODE_FEATURE_NOT_SUPPORTED is
one possibility.

regards, tom lane

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 8feb2fb..b4b97a7 100644
*** a/doc/src/sgml/json.sgml
--- b/doc/src/sgml/json.sgml
***************
*** 69,80 ****
    regardless of the database encoding, and are checked only for syntactic
    correctness (that is, that four hex digits follow <literal>\u</>).
    However, the input function for <type>jsonb</> is stricter: it disallows
!   Unicode escapes for non-ASCII characters (those
!   above <literal>U+007F</>) unless the database encoding is UTF8.  It also
!   insists that any use of Unicode surrogate pairs to designate characters
!   outside the Unicode Basic Multilingual Plane be correct.  Valid Unicode
!   escapes, except for <literal>\u0000</>, are then converted to the
!   equivalent ASCII or UTF8 character for storage.
   </para>
  
   <note>
--- 69,82 ----
    regardless of the database encoding, and are checked only for syntactic
    correctness (that is, that four hex digits follow <literal>\u</>).
    However, the input function for <type>jsonb</> is stricter: it disallows
!   Unicode escapes for non-ASCII characters (those above <literal>U+007F</>)
!   unless the database encoding is UTF8.  The <type>jsonb</> type also
!   rejects <literal>\u0000</> (because that cannot be represented in
!   <productname>PostgreSQL</productname>'s <type>text</> type), and it insists
!   that any use of Unicode surrogate pairs to designate characters outside
!   the Unicode Basic Multilingual Plane be correct.  Valid Unicode escapes
!   are converted to the equivalent ASCII or UTF8 character for storage;
!   this includes folding surrogate pairs into a single character.
   </para>
  
   <note>
***************
*** 134,140 ****
     <row>
      <entry><type>string</></entry>
      <entry><type>text</></entry>
!     <entry>See notes above concerning encoding restrictions</entry>
     </row>
     <row>
      <entry><type>number</></entry>
--- 136,143 ----
     <row>
      <entry><type>string</></entry>
      <entry><type>text</></entry>
!     <entry><literal>\u0000</> is disallowed, as are non-ASCII Unicode
!      escapes if database encoding is not UTF8</entry>
     </row>
     <row>
      <entry><type>number</></entry>
diff --git a/doc/src/sgml/release-9.4.sgml b/doc/src/sgml/release-9.4.sgml
index 961e461..11bbf3b 100644
*** a/doc/src/sgml/release-9.4.sgml
--- b/doc/src/sgml/release-9.4.sgml
***************
*** 103,124 ****
  
   <listitem>
    <para>
-   Unicode escapes in <link linkend="datatype-json"><type>JSON</type></link>
-   text values are no longer rendered with the backslash escaped
-   (Andrew Dunstan)
-  </para>
- 
-  <para>
-   Previously, all backslashes in text values being formed into JSON
-   were escaped. Now a backslash followed by <literal>u</> and four
-   hexadecimal digits is not escaped, as this is a legal sequence in a
-   JSON string value, and escaping the backslash led to some perverse
-   results.
-  </para>
- </listitem>
- 
- <listitem>
-  <para>
    When converting values of type <type>date</>, <type>timestamp</>
    or <type>timestamptz</>
    to <link linkend="datatype-json"><type>JSON</type></link>, render the
--- 103,108 ----
diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c
index 3c137ea..4e46b0a 100644
*** a/src/backend/utils/adt/json.c
--- b/src/backend/utils/adt/json.c
*************** json_lex_string(JsonLexContext *lex)
*** 806,819 ****
  	 * For UTF8, replace the escape sequence by the actual
  	 * utf8 character in lex->strval. Do this also for other
  	 * encodings if the escape designates an ASCII character,
! 	 * otherwise raise an error. We don't ever unescape a
! 	 * \u0000, since that would result in an impermissible nul
! 	 * byte.
  	 */
  
  	if (ch == 0)
  	{
! 		appendStringInfoString(lex->strval, "\\u0000");
  	}
  	else if (GetDatabaseEncoding() == PG_UTF8)
  	{
--- 806,822 ----
  	 * For UTF8, replace the escape sequence by the actual
  	 * utf8 character in lex->strval. Do this also for other
  	 * encodings if the escape designates an ASCII character,
! 	 * 

Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2015-01-29 Thread Alvaro Herrera
Pavel Stehule wrote:

 should not be used a pessimist controlled locking instead?
 

Patches welcome.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: Reducing lock strength of trigger and foreign key DDL

2015-01-29 Thread Michael Paquier
On Fri, Jan 23, 2015 at 8:55 AM, Andreas Karlsson andr...@proxel.se wrote:
 On 01/22/2015 10:31 PM, Andreas Karlsson wrote:

 I agree with this view, and am not sure myself that it is worth lowering
 the lock level of ALTER TRIGGER RENAME. I have attached a patch without
 the changes to ALTER TRIGGER and ruleutils.c and also fixes the comment
 issues noted by Andres.


 Whops, forgot to include the isolation tests.
Ok, so the deal is to finally reduce the locks to
ShareRowExclusiveLock for the following commands:
- CREATE TRIGGER
- ALTER TABLE ENABLE/DISABLE
- ALTER TABLE ADD CONSTRAINT
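
As an aside, the practical effect of the lower lock level would look roughly
like the sketch below (table, trigger, and function names are hypothetical):

-- Session 1: under the patch, CREATE TRIGGER takes ShareRowExclusiveLock on
-- the table, held until commit.
BEGIN;
CREATE TRIGGER accounts_audit
    AFTER UPDATE ON accounts
    FOR EACH ROW EXECUTE PROCEDURE audit_row();

-- Session 2, concurrently:
SELECT count(*) FROM accounts;        -- no longer blocks (ACCESS SHARE is compatible)
INSERT INTO accounts DEFAULT VALUES;  -- still blocks until session 1 commits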

Looking at the latest patch, it seems that in
AlterTableGetLockLevel@tablecmds.c we ought to put AT_ReAddConstraint,
AT_AddConstraintRecurse and AT_ProcessedConstraint under the same
banner as AT_AddConstraint. Thoughts?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Peter Geoghegan
On Thu, Jan 29, 2015 at 10:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I made the \u0000 error be errcode(ERRCODE_INVALID_TEXT_REPRESENTATION)
 and errmsg("invalid input syntax for type json"), by analogy to what's
 thrown for non-ASCII Unicode escapes in non-UTF8 encoding.  I'm not
 terribly happy with that, though.  ISTM that for both cases, this is
 not invalid syntax at all, but an implementation restriction that
 forces us to reject perfectly valid syntax.  So I think we ought to
 use a different ERRCODE and text message, though I'm not entirely
 sure what it should be instead.  ERRCODE_FEATURE_NOT_SUPPORTED is
 one possibility.

I personally prefer what you have here.

The point of JSONB is that we take a position on certain aspects like
this. We're bridging a pointedly loosey goosey interchange format,
JSON, with native PostgreSQL types. For example, we take a firm
position on encoding. The JSON type is a bit more permissive, to about
the extent that that's possible. The whole point is that we're
interpreting JSON data in a way that's consistent with *Postgres*
conventions. You'd have to interpret the data according to *some*
convention in order to do something non-trivial with it in any case,
and users usually want that.

It's really nice the way encoding is a strict implementation detail
within Postgres in general, in the sense that you know that if your
application code is concerned about encoding, you're probably thinking
about the problem incorrectly (at least once data has crossed the
database encoding border). MySQL's lackadaisical attitudes here appear
to have been an enormous mistake.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 I looked into it, and it turns out that MongoDB does not accept NUL in
 at least some contexts (for object keys). Apparently it wasn't always
 so. MongoDB previously had a security issue that was fixed by
 introducing this restriction. Their JSON-centric equivalent of
 per-column privileges was for a time compromised, because NUL
 injection was possible:

 https://www.idontplaydarts.com/2011/02/mongodb-null-byte-injection-attacks/

 It's easy to bash MongoDB, but this is still an interesting data
 point. They changed this after the fact, and yet I can find no
 evidence of any grumbling about it from end users. No one really
 noticed.

Hoo, that's interesting.  Lends some support to my half-baked idea that
we might disallow NUL in object keys even if we are able to allow it
elsewhere in JSON strings.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Safe memory allocation functions

2015-01-29 Thread Michael Paquier
I wrote:
 Yes, this refactoring was good for testing actually...
Oops, I was too hasty when sending the previous patch; there was a
bug related to huge allocations. A patch correcting this bug is
attached.

Also attached are two things I have used to test the new API:
- A hack refactoring the existing routines MemoryContextAlloc* to use
the extended API
- An extension with a function doing a direct call to the extended API
able to control the flags used:
CREATE FUNCTION blackhole_palloc(size bigint,
is_huge bool,
is_no_oom bool,
is_zero bool,
is_zero_aligned bool)

Here are some tests done on a small box with 384MB, using direct calls to
the extended API:
=# create extension blackhole ;
CREATE EXTENSION
-- failure for normal allocation because size = 1GB
=# select blackhole_palloc(1024 * 1024 * 1024, false, false, false, false);
ERROR:  XX000: invalid memory alloc request size 1073741824
LOCATION:  MemoryContextAllocExtended, mcxt.c:628
-- Failure of OOM because normal allocation can be done, but no memory
=# select blackhole_palloc(1024 * 1024 * 1024 - 1, false, false, false, false);
ERROR:  53200: out of memory
DETAIL:  Failed on request of size 1073741823.
LOCATION:  MemoryContextAllocExtended, mcxt.c:639
-- No failure, bypassing OOM error
=# select blackhole_palloc(1024 * 1024 * 1024 - 1, false, true, false, false);
 blackhole_palloc
--
 null
(1 row)
-- Huge allocation, no error because OOM error is bypassed
=# select blackhole_palloc(1024 * 1024 * 1024, true, true, false, false);
 blackhole_palloc
--
 null
(1 row)
-- OOM error, huge allocation failure
=# select blackhole_palloc(1024 * 1024 * 1024, true, false, false, false);
ERROR:  53200: out of memory
DETAIL:  Failed on request of size 1073741824.
LOCATION:  MemoryContextAllocExtended, mcxt.c:639
-- Assertion failure, zero and zero aligned cannot be called at the same time
=# select blackhole_palloc(1024 * 1024, false, false, true, true);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
-- 
Michael
From 09bf9364a80dfe6426c91bdefe6ae2135e6f20c3 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Fri, 30 Jan 2015 12:56:21 +0900
Subject: [PATCH 1/2] Create MemoryContextAllocExtended central routine for
 memory allocation

This new routine is a central point that can be used by extensions and
third-party utilities in a more extensive way than the already present
MemoryContextAlloc routines; one of the control flags introduced is
particularly useful for avoiding out-of-memory errors when an allocation
request cannot be completed.
---
 src/backend/utils/mmgr/mcxt.c | 57 +++
 src/include/utils/palloc.h| 12 +
 2 files changed, 69 insertions(+)

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index c62922a..fce8a0e 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -603,6 +603,63 @@ MemoryContextCreate(NodeTag tag, Size size,
 }
 
 /*
+ * MemoryContextAllocExtended
+ *		Allocate space within the specified context using flag options
+ *		defined by caller.
+ *
+ * The following control flags can be used:
+ * - MCXT_ALLOC_HUGE, allocate possibly-expansive space. this is
+ *   equivalent to MemoryContextAllocHuge.
+ * - MCXT_ALLOC_NO_OOM, not fail in case of allocation request
+ *   failure and return NULL.
+ * - MCXT_ALLOC_ZERO, clear allocated memory using MemSetAligned.
+ * - MCXT_ALLOC_ZERO_ALIGNED, clear memory using MemSetLoop.
+ */
+void *
+MemoryContextAllocExtended(MemoryContext context, Size size, int flags)
+{
+	void	   *ret;
+
+	AssertArg(MemoryContextIsValid(context));
+	AssertNotInCriticalSection(context);
+
+	if (((flags & MCXT_ALLOC_HUGE) != 0 && !AllocHugeSizeIsValid(size)) ||
+		((flags & MCXT_ALLOC_HUGE) == 0 && !AllocSizeIsValid(size)))
+		elog(ERROR, "invalid memory alloc request size %zu", size);
+
+	context->isReset = false;
+
+	ret = (*context->methods->alloc) (context, size);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0 && ret == NULL)
+	{
+		MemoryContextStats(TopMemoryContext);
+		ereport(ERROR,
+(errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory"),
+ errdetail("Failed on request of size %zu.", size)));
+	}
+
+	if (ret == NULL)
+		return NULL;
+
+	VALGRIND_MEMPOOL_ALLOC(context, ret, size);
+
+	/*
+	 * MemSetAligned and MemSetLoop should not be called in the same
+	 * context (see c.h for more details).
+	 */
+	Assert((flags & MCXT_ALLOC_ZERO) == 0 ||
+		   (flags & MCXT_ALLOC_ZERO_ALIGNED) == 0);
+
+	if ((flags & MCXT_ALLOC_ZERO) != 0)
+		MemSetAligned(ret, 0, size);
+	if ((flags & MCXT_ALLOC_ZERO_ALIGNED) != 0)
+		MemSetLoop(ret, 0, size);
+
+	return ret;
+}
+
+/*
  * MemoryContextAlloc
  *		Allocate space within the specified context.
  *
diff --git a/src/include/utils/palloc.h 
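
For backend code, a usage sketch of the proposed API (the surrounding function
is purely illustrative; the flag names are those defined by the patch):

#include "postgres.h"

static char *
try_alloc_work_buffer(Size nbytes)
{
	/*
	 * MCXT_ALLOC_NO_OOM makes the call return NULL instead of raising an
	 * out-of-memory error, and MCXT_ALLOC_ZERO asks for zeroed memory; the
	 * caller can then fall back to a smaller request or spill to disk.
	 */
	return (char *) MemoryContextAllocExtended(CurrentMemoryContext, nbytes,
											   MCXT_ALLOC_NO_OOM | MCXT_ALLOC_ZERO);
}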

Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Andrew Dunstan


On 01/29/2015 05:39 PM, Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:

I have yet to understand what we fix by banning \u0000.  How is it
different from any other four-digit hexadecimal number that's not a
valid character in the current encoding?  What does banning that one
particular value do?

BTW, as to the point about encoding violations: we *already* ban \u
sequences that don't correspond to valid characters in the current
encoding.  The attempt to exclude U+0000 from the set of banned characters
was ill-advised, plain and simple.




Actually, unless the encoding is utf8 we ban all non-ascii unicode 
escapes even if they might designate a valid character in the current 
encoding. This was arrived at after some discussion here. So adding 
\u0000 to the list of banned characters is arguably just making us more 
consistent.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TABLESAMPLE patch

2015-01-29 Thread Jim Nasby

On 1/29/15 10:44 AM, Bruce Momjian wrote:

On Thu, Jan 29, 2015 at 11:08:55AM -0500, Robert Haas wrote:

On Wed, Jan 28, 2015 at 5:19 AM, Petr Jelinek p...@2ndquadrant.com wrote:

Yes, that's my view too. I would generally be for that change also and it
would be worth it if the code was used in more than one place, but as it is
it seems like it will just add code/complexity for no real benefit. It would
make sense in case we used sequential scan node instead of the new node as
Amit also suggested, but I remain unconvinced that mixing sampling and
sequential scan into single scan node would be a good idea.


Based on previous experience, I expect that any proposal to merge
those nodes would get shot down by Tom with his laser-guided atomic
bazooka faster than you can say -1 from me regards tom lane.


Do we get illustrations with that?  ;-)  I want a poster for my wall!


+1. It should also be the tshirt for the next pgCon. ;)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade and rsync

2015-01-29 Thread David Steele
On 1/29/15 8:09 PM, Jim Nasby wrote:
 On 1/29/15 7:02 PM, David Steele wrote:
 On 1/29/15 7:55 PM, Jim Nasby wrote:
 On 1/29/15 6:25 PM, David Steele wrote:
 Safe backups can be done without LSNs provided you are willing to
 trust
 your timestamps.

 Which AFAICT simply isn't safe to do at all... except maybe with the
 manifest stuff you've talked about?

 Yes - that's what I'm talking about.  I had hoped to speak about this at
 PgConfNYC, but perhaps I can do it in a lightning talk instead.

 Sounds like maybe it should be part of our documentation too...

I think the warnings Bruce has added to the documentation about using
checksums are sufficient for now.  The manifest build and delay
methodology are part of PgBackRest, the backup solution I'm working on
as an alternative to barman, etc.  It's not something that can be
implemented trivially.

-- 
- David Steele
da...@pgmasters.net




signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Error check always bypassed in tablefunc.c

2015-01-29 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 So, I have been poking at this code a bit more and as the values of
 the parameters are passed as-is to the SQL queries that connectby
 generates internally (this is as well mentioned in the documentation
 here: http://www.postgresql.org/docs/devel/static/tablefunc.html), you
 can do quite fancy things by passing for example values of the type
"foo FROM table; --" or similar. Particularly, by enforcing a query
 returning only one column, or NULL values I am even able to crash the
 server. The interesting part is that even if compatConnectbyTupleDescs
 is enabled for each level, it is still possible to crash the server by
 passing for example NULL values casted to the same type, like that
 'NULL::text, NULL::text; --'.
 The attached patch fixes all those things, I have also enabled
 compatConnectbyTupleDescs to run at each level. I'll add it to the
 next CF as well to not lose track of it. This behavior has been like
 that forever...

Since this is a potential-core-dump fix, I went ahead and dealt with it
rather than waiting for the next CF.  I made a few adjustments:

* I thought that the way you changed the type compatibility tests was
overcomplicated and unnecessary.  As the code stands, there's no great
need to insist on type compatibility at all: if the printed form of
the constructed query's results is acceptable to the outer query's types,
it'll work, and otherwise will throw a reasonably intelligible error.
Now, we might well want to improve this code to skip the conversion to
text and back at some point, in which case we would need to insist on
a type match.  But in neither of those cases does it seem helpful to
ask whether there is a SQL type cast, as your patch was doing.  The
existence of a cast does not imply I/O representation compatibility,
so it's not in line with the current behavior, and if we want to skip
text conversion we'd prefer it's exactly the same type, which is the
check as it originally existed before it accidentally got broken.

What I did about this was to leave the behavior alone in back branches,
but insist on a type match in HEAD.  I think we can reasonably tighten
the type requirements in a new major release, but doing it in minor
releases is probably a bad idea.

* I thought it was odd to throw an error for NULL input, especially
an infinite recursion error.  However, even with your patch the code
would've dumped core on a null current_key value (since it would've
passed a null start_with down to the next recursion level).  What I
changed it to was to omit the recursion test (and hence print the row)
and then not recurse at a null.  This seems consistent and reasonable.
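
A toy illustration of the new behavior (the table contents are made up): the
row whose key is NULL is printed but not recursed into, instead of raising an
infinite-recursion error.

CREATE EXTENSION tablefunc;
CREATE TABLE tree (keyid text, parent_keyid text);
INSERT INTO tree VALUES ('top', NULL), ('child', 'top'), (NULL, 'child');

SELECT * FROM connectby('tree', 'keyid', 'parent_keyid', 'top', 0)
              AS t(keyid text, parent_keyid text, level int);
--  keyid | parent_keyid | level
-- -------+--------------+-------
--  top   |              |     0
--  child | top          |     1
--        | child        |     2
-- (3 rows)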

* I made a few other minor cleanups as well, in particular getting
rid of some unnecessary pstrdup() steps.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb, unicode escapes and escaped backslashes

2015-01-29 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 On Thu, Jan 29, 2015 at 10:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I made the \u0000 error be errcode(ERRCODE_INVALID_TEXT_REPRESENTATION)
 and errmsg("invalid input syntax for type json"), by analogy to what's
 thrown for non-ASCII Unicode escapes in non-UTF8 encoding.  I'm not
 terribly happy with that, though.  ISTM that for both cases, this is
 not invalid syntax at all, but an implementation restriction that
 forces us to reject perfectly valid syntax.  So I think we ought to
 use a different ERRCODE and text message, though I'm not entirely
 sure what it should be instead.  ERRCODE_FEATURE_NOT_SUPPORTED is
 one possibility.

 I personally prefer what you have here.

 The point of JSONB is that we take a position on certain aspects like
 this. We're bridging a pointedly loosey goosey interchange format,
 JSON, with native PostgreSQL types. For example, we take a firm
 position on encoding. The JSON type is a bit more permissive, to about
 the extent that that's possible. The whole point is that we're
 interpreting JSON data in a way that's consistent with *Postgres*
 conventions. You'd have to interpret the data according to *some*
 convention in order to do something non-trivial with it in any case,
 and users usually want that.

I quite agree with you, actually, in terms of that perspective.  But my
point remains: \u0000 is not invalid JSON syntax, and neither is
\u1234.  If we choose to throw an error because we can't interpret or
process that according to our conventions, fine, but we should call it
something other than invalid syntax.

ERRCODE_UNTRANSLATABLE_CHARACTER or ERRCODE_CHARACTER_NOT_IN_REPERTOIRE
seem more apropos from here.  And I still think there's a case to be
made for ERRCODE_FEATURE_NOT_SUPPORTED, because it's at least possible
that we'd relax this restriction in future (eg, allow Unicode characters
that can be converted to the database's encoding).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers