Re: [HACKERS] btree_gin and ranges

2014-12-26 Thread Teodor Sigaev

Teodor's patch could use some more comments. The STOP_SCAN/MATCH_SCAN/CONT_SCAN
macros are a good idea, but they probably should go into
src/include/access/gin.h so that they can be used in all compare_partial
implementations.


STOP_SCAN/MATCH_SCAN/CONT_SCAN macros are moved to gin's header, and comments 
are improved.


Split the patch in two: gin and module.
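
For context, a minimal example of the kind of scan in which btree_gin's
compare_partial function (and therefore these macros) gets exercised; the
table and index names below are made up for illustration:

-- btree_gin's GIN opclasses use compare_partial for range-style scans,
-- which is where STOP_SCAN/MATCH_SCAN/CONT_SCAN are meant to be used.
CREATE EXTENSION IF NOT EXISTS btree_gin;
CREATE TABLE gin_demo (val int4);
INSERT INTO gin_demo SELECT g FROM generate_series(1, 100000) g;
CREATE INDEX gin_demo_idx ON gin_demo USING gin (val);
SET enable_seqscan = off;
EXPLAIN (COSTS OFF)
SELECT count(*) FROM gin_demo WHERE val BETWEEN 100 AND 200;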


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


gin_macros.patch.gz
Description: GNU Zip compressed data


btree_gin_range-3.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2014-12-26 Thread Pavel Stehule
2014-12-25 22:23 GMT+01:00 Alex Shulgin a...@commandprompt.com:

 Trent Shipley trent_ship...@qwest.net writes:

  On Friday 2007-12-14 16:22, Tom Lane wrote:
  Neil Conway ne...@samurai.com writes:
   By modifying COPY: COPY IGNORE ERRORS or some such would instruct COPY
   to drop (and log) rows that contain malformed data. That is, rows with
   too many or too few columns, rows that result in constraint
 violations,
   and rows containing columns where the data type's input function
 raises
   an error. The last case is the only thing that would be a bit tricky
 to
   implement, I think: you could use PG_TRY() around the
 InputFunctionCall,
   but I guess you'd need a subtransaction to ensure that you reset your
   state correctly after catching an error.
 
  Yeah.  It's the subtransaction per row that's daunting --- not only the
  cycles spent for that, but the ensuing limitation to 4G rows imported
  per COPY.
 
  You could extend the COPY FROM syntax with a COMMIT EVERY n clause.  This
  would help with the 4G subtransaction limit.  The cost to the ETL
 process is
  that a simple rollback would not be guaranteed send the process back to
 it's
  initial state.  There are easy ways to deal with the rollback issue
 though.
 
  A {NO} RETRY {USING algorithm} clause might be useful.   If the NO RETRY
  option is selected then the COPY FROM can run without subtransactions
 and in
  excess of the 4G per transaction limit.  NO RETRY should be the default
 since
  it preserves the legacy behavior of COPY FROM.
 
  You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not
 give the
  option of sending exceptions to a table since they are presumably
 malformed,
  otherwise they would not be exceptions.  (Users should re-process
 exception
  files if they want an if good then table a else exception to table b ...)
 
  EXCEPTIONS TO and NO RETRY would be mutually exclusive.
 
 
  If we could somehow only do a subtransaction per failure, things would
  be much better, but I don't see how.

 Hello,

 Attached is a proof of concept patch for this TODO item.  There are no
 docs yet; I just wanted to know if the approach is sane.

 The added syntax is like the following:

   COPY [table] FROM [file/program/stdin] EXCEPTIONS TO [file or stdout]

 The way it's done is by abusing Copy Both mode, and from my limited
 testing that seems to just work.  The error trapping itself is done
 using PG_TRY/PG_CATCH and can only catch formatting or before-insert
 trigger errors, no attempt is made to recover from a failed unique
 constraint, etc.

 Example in action:

 postgres=# \d test_copy2
   Table "public.test_copy2"
  Column |  Type   | Modifiers
 --------+---------+-----------
  id     | integer |
  val    | integer |

 postgres=# copy test_copy2 from program 'seq 3' exceptions to stdout;
 1
 NOTICE:  missing data for column val
 CONTEXT:  COPY test_copy2, line 1: 1
 2
 NOTICE:  missing data for column val
 CONTEXT:  COPY test_copy2, line 2: 2
 3
 NOTICE:  missing data for column val
 CONTEXT:  COPY test_copy2, line 3: 3
 NOTICE:  total exceptions ignored: 3

 postgres=# \d test_copy1
   Table "public.test_copy1"
  Column |  Type   | Modifiers
 --------+---------+-----------
  id     | integer | not null

 postgres=# set client_min_messages to warning;
 SET
 postgres=# copy test_copy1 from program 'ls /proc' exceptions to stdout;
 ...
 vmstat
 zoneinfo
 postgres=#

 Limited performance testing shows no significant difference between
 error-catching and plain code path.  For example, timing

   copy test_copy1 from program 'seq 100' [exceptions to stdout]

 shows similar numbers with or without the added exceptions to clause.

 Now that I'm sending this I wonder if the original comment about the
 need for subtransaction around every loaded line still holds.  Any
 example of what would be not properly rolled back by just PG_TRY?


This method is unsafe: exception handlers usually don't free memory, so
there is a risk of memory leaks and resource leaks.

You can get the same performance with block subtransactions - when you use
one subtransaction per 1000 rows, the impact of subtransactions is minimal.

When a block fails, you can fall back to a row-level subtransaction - that
works well when you expect the data to be almost entirely correct.
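
For illustration only - a rough plpgsql sketch of that batching idea, not part
of the patch, assuming hypothetical tables staging(line_no bigint, id text,
val text) for the raw input and target(id int PRIMARY KEY, val int NOT NULL)
as the destination.  Each plpgsql EXCEPTION block runs in its own
subtransaction, so a failed batch can be redone row by row:

CREATE OR REPLACE FUNCTION load_in_batches(batch_size bigint DEFAULT 1000)
RETURNS bigint LANGUAGE plpgsql AS $$
DECLARE
    lo     bigint := 0;
    hi     bigint;
    n      bigint;
    loaded bigint := 0;
    r      staging%ROWTYPE;
BEGIN
    SELECT coalesce(max(line_no), 0) INTO hi FROM staging;
    WHILE lo < hi LOOP
        BEGIN   -- one subtransaction for the whole batch
            INSERT INTO target
                SELECT id::int, val::int FROM staging
                WHERE line_no > lo AND line_no <= lo + batch_size;
            GET DIAGNOSTICS n = ROW_COUNT;
            loaded := loaded + n;
        EXCEPTION WHEN others THEN
            -- the batch failed: redo it row by row, one subtransaction per
            -- row, skipping only the rows that actually fail
            FOR r IN SELECT * FROM staging
                     WHERE line_no > lo AND line_no <= lo + batch_size
            LOOP
                BEGIN
                    INSERT INTO target VALUES (r.id::int, r.val::int);
                    loaded := loaded + 1;
                EXCEPTION WHEN others THEN
                    RAISE NOTICE 'skipping line %: %', r.line_no, SQLERRM;
                END;
            END LOOP;
        END;
        lo := lo + batch_size;
    END LOOP;
    RETURN loaded;
END;
$$;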

Regards

Pavel



 Happy hacking!
 --
 Alex







Re: [HACKERS] POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2014-12-26 Thread Pavel Stehule
2014-12-26 11:41 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:



 2014-12-25 22:23 GMT+01:00 Alex Shulgin a...@commandprompt.com:

 Trent Shipley trent_ship...@qwest.net writes:

  On Friday 2007-12-14 16:22, Tom Lane wrote:
  Neil Conway ne...@samurai.com writes:
   By modifying COPY: COPY IGNORE ERRORS or some such would instruct
 COPY
   to drop (and log) rows that contain malformed data. That is, rows
 with
   too many or too few columns, rows that result in constraint
 violations,
   and rows containing columns where the data type's input function
 raises
   an error. The last case is the only thing that would be a bit tricky
 to
   implement, I think: you could use PG_TRY() around the
 InputFunctionCall,
   but I guess you'd need a subtransaction to ensure that you reset your
   state correctly after catching an error.
 
  Yeah.  It's the subtransaction per row that's daunting --- not only the
  cycles spent for that, but the ensuing limitation to 4G rows imported
  per COPY.
 
  You could extend the COPY FROM syntax with a COMMIT EVERY n clause.
 This
  would help with the 4G subtransaction limit.  The cost to the ETL
 process is
  that a simple rollback would not be guaranteed send the process back to
 it's
  initial state.  There are easy ways to deal with the rollback issue
 though.
 
  A {NO} RETRY {USING algorithm} clause might be useful.   If the NO RETRY
  option is selected then the COPY FROM can run without subtransactions
 and in
  excess of the 4G per transaction limit.  NO RETRY should be the default
 since
  it preserves the legacy behavior of COPY FROM.
 
  You could have an EXCEPTIONS TO {filename|STDERR} clause. I would not
 give the
  option of sending exceptions to a table since they are presumably
 malformed,
  otherwise they would not be exceptions.  (Users should re-process
 exception
  files if they want an if good then table a else exception to table b
 ...)
 
  EXCEPTIONS TO and NO RETRY would be mutually exclusive.
 
 
  If we could somehow only do a subtransaction per failure, things would
  be much better, but I don't see how.

 Hello,

 Attached is a proof of concept patch for this TODO item.  There are no
 docs yet; I just wanted to know if the approach is sane.

 The added syntax is like the following:

   COPY [table] FROM [file/program/stdin] EXCEPTIONS TO [file or stdout]

 The way it's done is by abusing Copy Both mode, and from my limited
 testing that seems to just work.  The error trapping itself is done
 using PG_TRY/PG_CATCH and can only catch formatting or before-insert
 trigger errors, no attempt is made to recover from a failed unique
 constraint, etc.

 Example in action:

 postgres=# \d test_copy2
   Table public.test_copy2
  Column |  Type   | Modifiers
 +-+---
  id | integer |
  val| integer |

 postgres=# copy test_copy2 from program 'seq 3' exceptions to stdout;
 1
 NOTICE:  missing data for column val
 CONTEXT:  COPY test_copy2, line 1: 1
 2
 NOTICE:  missing data for column val
 CONTEXT:  COPY test_copy2, line 2: 2
 3
 NOTICE:  missing data for column val
 CONTEXT:  COPY test_copy2, line 3: 3
 NOTICE:  total exceptions ignored: 3

 postgres=# \d test_copy1
   Table public.test_copy1
  Column |  Type   | Modifiers
 +-+---
  id | integer | not null

 postgres=# set client_min_messages to warning;
 SET
 postgres=# copy test_copy1 from program 'ls /proc' exceptions to stdout;
 ...
 vmstat
 zoneinfo
 postgres=#

 Limited performance testing shows no significant difference between
 error-catching and plain code path.  For example, timing

   copy test_copy1 from program 'seq 100' [exceptions to stdout]

 shows similar numbers with or without the added exceptions to clause.

 Now that I'm sending this I wonder if the original comment about the
 need for subtransaction around every loaded line still holds.  Any
 example of what would be not properly rolled back by just PG_TRY?


 This method is unsafe: exception handlers usually don't free memory, so
 there is a risk of memory leaks and resource leaks.

 You can get the same performance with block subtransactions - when you use
 one subtransaction per 1000 rows, the impact of subtransactions is minimal.

 When a block fails, you can fall back to a row-level subtransaction - that
 works well when you expect the data to be almost entirely correct.


Two years ago I wrote an extension that did this, but I have not had time to
finish it and push it upstream.

Regards

Pavel



 Regards

 Pavel



 Happy hacking!
 --
 Alex








ftcopy-04.tgz
Description: GNU Zip compressed data



Re: [HACKERS] Additional role attributes superuser review

2014-12-26 Thread Magnus Hagander
On Wed, Dec 24, 2014 at 6:48 PM, Adam Brightwell 
adam.brightw...@crunchydatasolutions.com wrote:

 All,

 I want to revive this thread and continue to move these new role
 attributes forward.

 In summary, the ultimate goal is to include new role attributes for common
 operations which currently require superuser privileges.

 Initially proposed were the following attributes:

 * BACKUP - allows role to perform backup operations
 * LOGROTATE - allows role to rotate log files
 * MONITOR - allows role to view pg_stat_* details
 * PROCSIGNAL - allows role to signal backend processes

 It seems that PROCSIGNAL and MONITOR were generally well received and
 probably don't warrant much more discussion at this point.

 However, based on previous discussions, there seemed to be some
 uncertainty on how to handle BACKUP and LOGROTATE.

 Concerns:

 * LOGROTATE - only associated with one function/operation.
 * BACKUP - perceived to be too broad a permission, as it would
 provide the ability to run pg_start/stop_backup and the xlog-related
 functions.  The general sentiment is that these should be handled as
 separate privileges.

* BACKUP - preferred usage is with pg_dump, giving a user the ability to
 run pg_dump on the whole database without being a superuser.

 Previous Recommendations:

 * LOGROTATE - Use OPERATOR - concern was expressed that this might be too
 general of an attribute for this purpose.  Also, concern for privilege
 'upgrades' as it includes more capabilities in later releases.
 * LOGROTATE - Use LOG_OPERATOR - generally accepted, but concern was raised
 for using extraneous descriptors such as '_OPERATOR' and '_ADMIN', etc.
 * BACKUP - Use WAL_CONTROL for pg_start/stop_backup - no major
 disagreement, though same concern regarding extraneous descriptors.
 * BACKUP - Use XLOG_OPERATOR for xlog operations - no major disagreement,
 though same concern regarding extraneous descriptors.
 * BACKUP - Use BACKUP for granting non-superuser ability to run pg_dump on
 whole database.

 Given the above and previous discussions:

 I'd like to propose the following new role attributes:

 BACKUP - allows role to perform pg_dump* backups of whole database.


I'd suggest calling it DUMP if that's what it allows, to keep it separate
from the backup parts.
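
(Purely to illustrate the naming being discussed - hypothetical syntax, since
none of these attributes exist yet and the keyword is only a proposal.)

-- Hypothetical, mirroring existing role attributes such as REPLICATION:
CREATE ROLE dump_user LOGIN DUMP;
-- dump_user could then run pg_dump on the whole database without being a superuser.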



 WAL - allows role to execute pg_start_backup/pg_stop_backup functions.

 XLOG - allows role to execute xlog operations.


Those seem like really bad names, IMHO. Why? Because we use WAL and XLOG
throughout the documentation and parameters and code to mean *the same thing*,
and here they'd suddenly mean different things. If we need them as separate
privileges, I think we need much better names. (And a better description -
what are "xlog operations", really?)

And the one under WAL would be very similar to what REPLICATION does today.
Or are you saying that it should specifically *not* allow base backups done
through the replication protocol, only the exclusive ones?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates

2014-12-26 Thread David Rowley
On 24 December 2014 at 16:04, Andreas Karlsson andr...@proxel.se wrote:

On 12/16/2014 11:04 AM, David Rowley wrote:
 These are some very promising performance increases.


 I've done a quick pass of reading the patch. I currently don't have a
 system with a 128bit int type, but I'm working on that.


 Sorry for taking some time to get back. I have been busy before Christmas.
 A new version of the patch is attached.


Ok I've had another look at this, and I think the only things that I have
to say have been mentioned already:

1. Do we need to keep the 128 byte aggregate state size for machines
without 128 bit ints? This has been reduced to 48 bytes in the patch, which
assumes the code is compiled with a compiler that has 128 bit ints.  I
kind of think that we need to keep the 128 byte estimate for compilers
that don't support int128, but I'd like to hear any counter arguments.

2. References to int16 meaning 16 bytes. I'm really in two minds about
this: it's quite nice to keep the natural flow, int4, int8, int16, but I
can't help thinking that this will confuse someone one day. It would be a
long time before it confused anyone if we called it int128 instead, but I'm
not that excited about seeing it renamed either. I'd like to hear what
others have to say... Is there a chance that some SQL standard in the
distant future will have HUGEINT and we might regret not getting the
internal names nailed down?

I also checked the performance of some window function calls; since
these call the final function for each row, I thought I'd better make sure
there was no regression, as the final function must perform a conversion
from int128 to numeric for each row.

It seems there's still an increase in performance:


Setup:
create table bi_win (i bigint primary key);
insert into bi_win select x.x from generate_series(1,1) x(x);
vacuum analyze;

Query:
select sum(i) over () from bi_win;

** Master
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 6567
latency average: 9.137 ms
tps = 109.445841 (including connections establishing)
tps = 109.456941 (excluding connections establishing)

** Patched
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 7841
latency average: 7.652 ms
tps = 130.670253 (including connections establishing)
tps = 130.675743 (excluding connections establishing)

Setup:
create table i_win (i int primary key);
insert into i_win select x.x from generate_series(1,1) x(x);
vacuum analyze;

Query:
select stddev(i) over () from i_win;

** Master
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 5084
latency average: 11.802 ms
tps = 84.730362 (including connections establishing)
tps = 84.735693 (excluding connections establishing)

** Patched
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 7557
latency average: 7.940 ms
tps = 125.934787 (including connections establishing)
tps = 125.943176 (excluding connections establishing)

Regards

David Rowley


Re: [HACKERS] [PATCH] explain sortorder

2014-12-26 Thread Arne Scheffer
Heikki Linnakangas hlinnakangas(at)vmware(dot)com writes:
 I would suggest just adding the information to the Sort Key line. As
 long as you don't print the modifiers when they are defaults (ASC and
 NULLS LAST), we could print the information even in non-VERBOSE mode.

+1.  I had assumed without looking that that was what it did already,
else I'd have complained too.

   regards, tom lane

We will change the patch according to Heikki's suggestions.
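
For illustration (not from the patch): the case Heikki describes is a Sort Key
whose columns use non-default modifiers, e.g.

-- With the patched EXPLAIN, the Sort Key line would be annotated with
-- DESC / NULLS FIRST here, while plain ASC NULLS LAST keys stay unannotated.
EXPLAIN (COSTS OFF)
SELECT relname, relpages FROM pg_class
ORDER BY relname DESC, relpages NULLS FIRST;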

A nice Christmas & all the best in the New Year

Arne Scheffer

http://www.uni-muenster.de/ZIV/Mitarbeiter/ArneScheffer.shtml




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Kevin Grittner
nikita.y.vol...@mail.ru nikita.y.vol...@mail.ru wrote:

 Executing concurrent transactions inserting the same value of a
 unique key fails with the duplicate key error under code
 23505 instead of any of transaction conflict errors with a
 40*** code.

This is true, and can certainly be inconvenient when using
serializable transactions to simplify handling of race conditions,
because you can't just test for a SQLSTATE of '40001' or '40P01' to
indicate the need to retry the transaction.  You have two
reasonable ways to avoid duplicate keys if the values are synthetic
and automatically generated.  One is to use a SEQUENCE object to
generate the values.  The other (really only recommended if gaps in
the sequence are a problem) is to have the serializable transaction
update a row to claim the number.
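
A minimal sketch of the first suggestion, with a made-up table: when the key is
synthetic, let a sequence hand out the values, so concurrent transactions never
try to insert the same key in the first place.

CREATE TABLE account (
    id   bigserial PRIMARY KEY,   -- bigserial creates and uses a sequence
    name text NOT NULL
);
-- concurrent sessions can both do this without ever colliding on the key:
INSERT INTO account (name) VALUES ('alice') RETURNING id;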

Otherwise you need to consider errors related to duplicates as
possibly being caused by a concurrent transaction.  You may want to
do one transaction retry in such cases, and fail if an identical
error is seen.  Keep in mind that these errors do not allow
serialization anomalies to appear in the committed data, so are
arguably not violations of ACID principles -- more of a wart on the
otherwise clean technique of using serializable transactions to
simplify application programming under concurrency.

Thinking about it just now I think we might be able to generate a
write conflict instead of a duplicate key error for this case by
checking the visibility information for the duplicate row.  It
might not even have a significant performance impact, since we need
to check visibility information to generate the duplicate key
error.  That would still leave similar issues (where similar
arguments can be made) relating to foreign keys; but those can
largely be addressed already by declaring the constraints to be
DEFERRED -- and anyway, that would be a separate fix.

I'm moving this discussion to the -hackers list so that I can ask
other developers:

Are there any objections to generating a write conflict instead of
a duplicate key error if the duplicate key was added by a
concurrent transaction?  Only for transactions at isolation level
REPEATABLE READ or higher?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Still, I don't think this is a reasonable test design.  We have
 absolutely no idea what behaviors are being triggered in the other
 tests, except that they are unrelated to what those tests think they
 are testing.

 I can of course move it to a separate parallel test, but I don't think
 that should be really necessary.

I've not proven this rigorously, but it seems obvious in hindsight:
what's happening is that when the object_address test drops everything
with DROP CASCADE, other processes are sometimes just starting to execute
the event trigger when the DROP commits.  When they go to look up the
trigger function, they don't find it, leading to "cache lookup failed for
function".  The fact that the complained-of OID is slightly variable, but
always in the range of OIDs that would be getting assigned around this
point in a "make check" run, buttresses the theory.

I thought about changing the object_address test so that it explicitly
drops the event trigger first.  But that would not be a fix, it would
just make the timing harder to hit (ie, a victim process would need to
lose control for longer at the critical point).

Since I remain of the opinion that a test called object_address has no
damn business causing global side-effects, I think there are two
reasonable fixes:

1. Remove the event trigger.  This would slightly reduce the test's
coverage.

2. Run that whole test as a single transaction, so that the event trigger
is created and dropped in one transaction and is never seen as valid by
any concurrent test.
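
A rough sketch of option 2 (object names invented; assuming the 'event trigger'
type label accepted by pg_get_object_address()).  Because the event trigger is
created and dropped inside one transaction, no concurrent backend ever sees it
as valid, so none can fail while looking up its function:

BEGIN;
CREATE FUNCTION addr_evt_fn() RETURNS event_trigger
    LANGUAGE plpgsql AS $$ BEGIN NULL; END; $$;
CREATE EVENT TRIGGER addr_evt ON ddl_command_start
    EXECUTE PROCEDURE addr_evt_fn();
-- the pg_get_object_address() checks that need an event trigger go here, e.g.:
SELECT classid::regclass, objid
  FROM pg_get_object_address('event trigger', '{addr_evt}', '{}');
DROP EVENT TRIGGER addr_evt;
DROP FUNCTION addr_evt_fn();
COMMIT;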

A long-term idea is to try to fix things so that there's sufficient
locking to make dropping an event trigger and immediately dropping its
trigger function safe.  But I'm not sure that's either possible or a good
idea (the lock obtained by DROP would bring the entire database to a
standstill ...).

regards, tom lane




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Marko Tiikkaja

On 2014-12-26 17:23, Kevin Grittner wrote:

Are there any objections to generating a write conflict instead of
a duplicate key error if the duplicate key was added by a
concurrent transaction?  Only for transactions at isolation level
REPEATABLE READ or higher?


Is it possible to distinguish between an actual write conflict and a 
completely unrelated UNIQUE constraint ending up violated for some reason?



.marko




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Bruce Momjian
On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote:
 Hi.
 
 Here's a proposed patch to use CPUID at startup to determine if the
 SSE4.2 CRC instructions are available, to use them instead of the
 slice-by-8 implementation (posted earlier).
 
 A few notes:
 
 1. GCC has included cpuid.h since 4.3.0, so I figured it was safe to
use. It can be replaced with some inline assembly otherwise.
 
 2. I've also used the crc32b/crc32q instructions directly rather than
using .bytes to encode the instructions; binutils versions since
2007 or so have supported them.
 
 3. I've included the MSVC implementation mostly as an example of how to
extend this to different compilers/platforms. It's written according
to the documentation for MSVC intrinsics, but I have not tested it.
Suggestions/improvements are welcome.

Uh, what happens if the system is compiled on a different CPU than the one
it is run on?  Seems we would need a run-time CPU test.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Are there any objections to generating a write conflict instead of
 a duplicate key error if the duplicate key was added by a
 concurrent transaction?

Yes.  This will deliver a less meaningful error code, *and* break
existing code that is expecting the current behavior.

regards, tom lane




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Andres Freund
On December 26, 2014 4:50:33 PM CET, Bruce Momjian br...@momjian.us wrote:
On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote:
 Hi.
 
 Here's a proposed patch to use CPUID at startup to determine if the
 SSE4.2 CRC instructions are available, to use them instead of the
 slice-by-8 implementation (posted earlier).
 
 A few notes:
 
 1. GCC has included cpuid.h since 4.3.0, so I figured it was safe to
use. It can be replaced with some inline assembly otherwise.
 
 2. I've also used the crc32b/crc32q instructions directly rather than
using .bytes to encode the instructions; bintuils versions since
2007 or so have supported them.
 
 3. I've included the MSVC implementation mostly as an example of how
to
extend this to different compilers/platforms. It's written
according
to the documentation for MSVC intrinsics, but I have not tested
it.
Suggestions/improvements are welcome.

Uh, what happens if the system is compiled on a different CPU that it
is
run on?  Seems we would need a run-time CPU test.

That's the cpuid thing mentioned above.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Tom Lane wrote:
  Still, I don't think this is a reasonable test design.  We have
  absolutely no idea what behaviors are being triggered in the other
  tests, except that they are unrelated to what those tests think they
  are testing.
 
  I can of course move it to a separate parallel test, but I don't think
  that should be really necessary.
 
 I've not proven this rigorously, but it seems obvious in hindsight:
 what's happening is that when the object_address test drops everything
 with DROP CASCADE, other processes are sometimes just starting to execute
 the event trigger when the DROP commits.  When they go to look up the
 trigger function, they don't find it, leading to cache lookup failed for
 function.

Hm, maybe we can drop the event trigger explicitly first, then wait a
little bit, then drop the remaining objects with DROP CASCADE?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 I've not proven this rigorously, but it seems obvious in hindsight:
 what's happening is that when the object_address test drops everything
 with DROP CASCADE, other processes are sometimes just starting to execute
 the event trigger when the DROP commits.  When they go to look up the
 trigger function, they don't find it, leading to cache lookup failed for
 function.

 Hm, maybe we can drop the event trigger explicitely first, then wait a
 little bit, then drop the remaining objects with DROP CASCADE?

As I said, that's no fix; it just makes the timing harder to hit.  Another
process could be paused at the critical point for longer than whatever a
little bit is.

regards, tom lane




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Tom Lane wrote:
  I've not proven this rigorously, but it seems obvious in hindsight:
  what's happening is that when the object_address test drops everything
  with DROP CASCADE, other processes are sometimes just starting to execute
  the event trigger when the DROP commits.  When they go to look up the
  trigger function, they don't find it, leading to cache lookup failed for
  function.
 
  Hm, maybe we can drop the event trigger explicitely first, then wait a
  little bit, then drop the remaining objects with DROP CASCADE?
 
 As I said, that's no fix; it just makes the timing harder to hit.  Another
 process could be paused at the critical point for longer than whatever a
 little bit is.

Yeah, I was thinking we could play some games with the currently running
XIDs from a txid_snapshot or some such, with a reasonable upper limit on
the waiting time (for the rare cases with a server doing other stuff
with long-running transactions.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Hm, maybe we can drop the event trigger explicitely first, then wait a
 little bit, then drop the remaining objects with DROP CASCADE?

 As I said, that's no fix; it just makes the timing harder to hit.  Another
 process could be paused at the critical point for longer than whatever a
 little bit is.

 Yeah, I was thinking we could play some games with the currently running
 XIDs from a txid_snapshot or some such, with a reasonable upper limit on
 the waiting time (for the rare cases with a server doing other stuff
 with long-running transactions.)

Whether that's sane or not, the whole problem is so far out-of-scope for
a test of pg_get_object_address() that it's not even funny.  I think
we should adopt one of the two fixes I recommended and call it good.
If you want to work on making DROP EVENT TRIGGER safer in the long run,
that can be a separate activity.

regards, tom lane




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Bruce Momjian
On Fri, Dec 26, 2014 at 04:52:58PM +0100, Andres Freund wrote:
 On December 26, 2014 4:50:33 PM CET, Bruce Momjian br...@momjian.us wrote:
 On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote:
  Hi.
  
  Here's a proposed patch to use CPUID at startup to determine if the
  SSE4.2 CRC instructions are available, to use them instead of the
  slice-by-8 implementation (posted earlier).
  
  A few notes:
  
  1. GCC has included cpuid.h since 4.3.0, so I figured it was safe to
 use. It can be replaced with some inline assembly otherwise.
  
  2. I've also used the crc32b/crc32q instructions directly rather than
 using .bytes to encode the instructions; bintuils versions since
 2007 or so have supported them.
  
  3. I've included the MSVC implementation mostly as an example of how
 to
 extend this to different compilers/platforms. It's written
 according
 to the documentation for MSVC intrinsics, but I have not tested
 it.
 Suggestions/improvements are welcome.
 
 Uh, what happens if the system is compiled on a different CPU that it
 is
 run on?  Seems we would need a run-time CPU test.
 
 That's the cpuid thing mentioned above.

Oh, so cpuid is not a macro exported by the compiler but a C API.  Good.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Tom Lane wrote:
  Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Hm, maybe we can drop the event trigger explicitely first, then wait a
  little bit, then drop the remaining objects with DROP CASCADE?
 
  As I said, that's no fix; it just makes the timing harder to hit.  Another
  process could be paused at the critical point for longer than whatever a
  little bit is.
 
  Yeah, I was thinking we could play some games with the currently running
  XIDs from a txid_snapshot or some such, with a reasonable upper limit on
  the waiting time (for the rare cases with a server doing other stuff
  with long-running transactions.)
 
 Whether that's sane or not, the whole problem is so far out-of-scope for
 a test of pg_get_object_address() that it's not even funny.  I think
 we should adopt one of the two fixes I recommended and call it good.

I think dropping the part involving an event trigger from the test is
reasonable.  I will go do that.

 If you want to work on making DROP EVENT TRIGGER safer in the long run,
 that can be a separate activity.

This sounds like a huge project -- it's not like event triggers are the
only objects in the system where this is an issue, is it?  I'm sure
there is value in fixing it, but I have enough other projects.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Andres Freund
On December 26, 2014 6:05:34 PM CET, Bruce Momjian br...@momjian.us wrote:
On Fri, Dec 26, 2014 at 04:52:58PM +0100, Andres Freund wrote:
 On December 26, 2014 4:50:33 PM CET, Bruce Momjian br...@momjian.us
wrote:
 On Thu, Dec 25, 2014 at 11:57:29AM +0530, Abhijit Menon-Sen wrote:
  Hi.
  
  Here's a proposed patch to use CPUID at startup to determine if
the
  SSE4.2 CRC instructions are available, to use them instead of the
  slice-by-8 implementation (posted earlier).
  
  A few notes:
  
  1. GCC has included cpuid.h since 4.3.0, so I figured it was safe
to
 use. It can be replaced with some inline assembly otherwise.
  
  2. I've also used the crc32b/crc32q instructions directly rather
than
 using .bytes to encode the instructions; bintuils versions
since
 2007 or so have supported them.
  
  3. I've included the MSVC implementation mostly as an example of
how
 to
 extend this to different compilers/platforms. It's written
 according
 to the documentation for MSVC intrinsics, but I have not tested
 it.
 Suggestions/improvements are welcome.
 
 Uh, what happens if the system is compiled on a different CPU that
it
 is
 run on?  Seems we would need a run-time CPU test.
 
 That's the cpuid thing mentioned above.

Oh, so cpuid is not a macro exported by the compiler but a C API. 
Good

Neither, really. It's an instruction returning CPU capabilities.

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:

 Are there any objections to generating a write conflict instead
 of a duplicate key error if the duplicate key was added by a
 concurrent transaction?

 Yes.  This will deliver a less meaningful error code,

That depends entirely on whether you care more about whether the
problem was created by a concurrent transaction or exactly how that
concurrent transaction created the problem.  For those using
serializable transactions to manage concurrency the former is at
least an order of magnitude more important.  This is not the first
time getting a constraint violation SQLSTATE for the actions of a
concurrent serializable transaction has been reported as a bug.
Going from memory, I think this might be about the fifth time users
have reported it as a bug or potential bug on these lists.

People using serializable transactions normally run all queries
through common code which will retry the transaction from the start
if there is a SQLSTATE starting with '40' (or perhaps picking out
the specific codes '40001' and '40P01').  Not doing so for *some*
types of errors generated by concurrent transactions reduces the
application's level of user-friendliness, and may introduce
surprising bugs.  In this particular case the OP wants to do one
thing if a row with a particular value for a unique index exists,
and something different if it doesn't.  If we generate the write
conflict for the case where it is concurrently added, it can retry
the transaction and do one or the other; if we don't pay attention
to that, they need weird heuristics for the third case.  That
really is not more meaningful or useful.

 *and* break existing code that is expecting the current behavior.

Possibly, but my experience is more that failure to behave the way
I suggest is biting people and causing them a lot of extra work and
pain.  I would be fine with limiting the new behavior to
serializable transactions, since that seems to be where people want
this behavior.  It would bring us closer to "the transaction will
run as though it were the only transaction running or roll back
with a serialization failure" without having to add caveats and
exceptions.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Yes.  This will deliver a less meaningful error code,

 That depends entirely on whether you care more about whether the
 problem was created by a concurrent transaction or exactly how that
 concurrent transaction created the problem.

Just for starters, a 40XXX error report will fail to provide the
duplicated key's value.  This will be a functional regression,
on top of breaking existing code.

I think an appropriate response to these complaints is to fix the
documentation to point out that duplicate-key violations may also
be worthy of retries.  (I sort of thought it did already, actually,
but I see no mention of the issue in chapter 13.)

regards, tom lane




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Andres Freund
On December 26, 2014 6:10:51 PM CET, Alvaro Herrera alvhe...@2ndquadrant.com 
wrote:
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Tom Lane wrote:
  Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Hm, maybe we can drop the event trigger explicitely first, then
wait a
  little bit, then drop the remaining objects with DROP CASCADE?
 
  As I said, that's no fix; it just makes the timing harder to hit. 
Another
  process could be paused at the critical point for longer than
whatever a
  little bit is.
 
  Yeah, I was thinking we could play some games with the currently
running
  XIDs from a txid_snapshot or some such, with a reasonable upper
limit on
  the waiting time (for the rare cases with a server doing other
stuff
  with long-running transactions.)
 
 Whether that's sane or not, the whole problem is so far out-of-scope
for
 a test of pg_get_object_address() that it's not even funny.  I think
 we should adopt one of the two fixes I recommended and call it good.

I think dropping the part involving an event trigger from the test is
reasonable.  I will go do that.

 If you want to work on making DROP EVENT TRIGGER safer in the long
run,
 that can be a separate activity.

This sounds like a huge project -- it's not like event triggers are the
only objects in the system where this is an issue, is it?  I'm sure
there is value in fixing it, but I have enough other projects.

Can't we just move the test to run without parallelism? It's quite quick, so I
don't think it'd have noticeable consequences timewise.


-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 Can't we just move the test to run without parallelism? Its quite quick, so I 
 don't it'd have noticeable consequences timewise.

That just leaves the door open for somebody to add more tests parallel to
it in future.

TBH, I think we could have done without this test altogether; but if we're
going to have it, a minimum expectation is that it not be hazardous to
other tests around it.

regards, tom lane




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Alvaro Herrera
Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  Can't we just move the test to run without parallelism? Its quite quick, so 
  I don't it'd have noticeable consequences timewise.
 
 That just leaves the door open for somebody to add more tests parallel to
 it in future.

I've long wanted to add declarative dependencies to tests: each
test file would declare what other tests it depends on, and we would
have a special clause to state "this one must not be run concurrently
with anything else".  Of course, this is just wishful thinking at this
point.

 TBH, I think we could have done without this test altogether; but if we're
 going to have it, a minimum expectation is that it not be hazardous to
 other tests around it.

The number of assertion failures in get_object_address without all the
sanity checks I added in pg_get_object_address was a bit surprising.
That's the whole reason I decided to add the test.  I don't want to
blindly assume that all cases will work nicely in the future,
particularly as other object types are added.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Alvaro Herrera
Andres Freund wrote:

 This sounds like a huge project -- it's not like event triggers are the
 only objects in the system where this is an issue, is it?  I'm sure
 there is value in fixing it, but I have enough other projects.
 
 Can't we just move the test to run without parallelism? Its quite
 quick, so I don't it'd have noticeable consequences timewise.

(I got this a minute too late anyhow.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services




Re: [HACKERS] Some other odd buildfarm failures

2014-12-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 TBH, I think we could have done without this test altogether; but if we're
 going to have it, a minimum expectation is that it not be hazardous to
 other tests around it.

 The number of assertion failures in get_object_address without all the
 sanity checks I added in pg_get_object_address was a bit surprising.
 That's the whole reason I decided to add the test.  I don't want to
 blindly assume that all cases will work nicely in the future,
 particularly as other object types are added.

I'm surprised then that you didn't prefer the other solution (wrap the
whole test in a single transaction).  But we've probably spent more
time on this than it deserves.

regards, tom lane




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Bruce Momjian
On Fri, Dec 26, 2014 at 04:52:58PM +0100, Andres Freund wrote:
 Uh, what if the system is compiled on a different CPU that it
 is
 run on?  Seems we would need a run-time CPU test.
 
 That's the cpuid thing mentioned above.

Is this something that could potentially change the data stored on disk?
Does pg_upgrade need to check for changes in this area?  Is the
detection exposed by pg_controldata?  Could this affect running the data
directory on a different CPU?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Abhijit Menon-Sen
At 2014-12-26 13:11:43 -0500, br...@momjian.us wrote:

 Is this something that could potentially change the data stored on
 disk? Does pg_upgrade need to check for changes in this area?  Is the
 detection exposed by pg_controldata?  Could this affect running the
 data directory on a different CPU?

No to all.

Subsequent to Heikki's change (already in master) to use the CRC-32C
algorithm (instead of the earlier mistakenly-reflected-but-not-quite
one), both the slice-by-8 software implementation posted earlier and
the SSE4.2 CRC32* instructions will compute exactly the same value.

(Yes, I have verified this in both cases.)

-- Abhijit




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 Just for starters, a 40XXX error report will fail to provide the
 duplicated key's value.  This will be a functional regression,

Not if, as is normally the case, the transaction is retried from
the beginning on a serialization failure.  Either the code will
check for a duplicate (as in the case of the OP on this thread) and
they won't see the error, *or* the transaction which created
the duplicate key will have committed before the start of the retry
and you will get the duplicate key error.

 I think an appropriate response to these complaints is to fix the
 documentation to point out that duplicate-key violations may also
 be worthy of retries.

 but I see no mention of the issue in chapter 13.)

I agree that's the best we can do for stable branches, and worth
doing.

It would be interesting to hear from others who rely on
serializable transactions in production environments about what
makes sense to them.  This is probably the wrong list to find such
people directly; but I seem to recall Josh Berkus has a lot of
clients who do.  Josh?  Any opinion on this thread?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

2014-12-26 Thread Alexey Vasiliev
 Thanks for the suggestions.

Patch updated.


Mon, 22 Dec 2014 12:07:06 +0900 от Michael Paquier michael.paqu...@gmail.com:
On Tue, Nov 4, 2014 at 6:25 AM, Alexey Vasiliev  leopard...@inbox.ru  wrote:
 Added new patch.
Seems useful to me to be able to tune this interval of time.

I would simply reword the documentation as follows:
If <varname>restore_command</> returns a nonzero exit status code, retry the
command after the interval of time specified by this parameter.
Default value is <literal>5s</>.

Also, I think that it would be a good idea to error out if this
parameter has a value of let's say, less than 1s instead of doing a
check for a positive value in the waiting latch. On top of that, the
default value of restore_command_retry_interval should be 5000 and not
0 to simplify the code.
-- 
Michael




-- 
Alexey Vasiliev
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index ef78bc0..53ccf13 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -145,6 +145,29 @@ restore_command = 'copy C:\\server\\archivedir\\%f %p'  # Windows
   /listitem
  /varlistentry
 
+ <varlistentry id="restore-command-retry-interval" xreflabel="restore_command_retry_interval">
+  <term><varname>restore_command_retry_interval</varname> (<type>integer</type>)
+  <indexterm>
+    <primary><varname>restore_command_retry_interval</> recovery parameter</primary>
+  </indexterm>
+  </term>
+  <listitem>
+   <para>
+    If <varname>restore_command</> returns a nonzero exit status code, retry the
+    command after the interval of time specified by this parameter.
+    Default value is <literal>5s</>.
+   </para>
+   <para>
+    This is useful when WAL segments are restored from external storage
+    (such as AWS S3) and the standby is expected to lag behind the master
+    anyway.  Since every request to AWS S3 has to be paid for, N nodes
+    retrying the next WAL segment every 5 seconds cost more than they
+    would if they retried, for example, every 30 seconds.
+   </para>
+  </listitem>
+ </varlistentry>
+
 /variablelist
 
   /sect1
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index b777400..7ed6f3b 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -58,6 +58,11 @@
 #
 #recovery_end_command = ''
 #
+# specifies an optional interval to wait after a nonzero exit code from
+# restore_command before retrying.  This can be useful to increase or
+# decrease the number of restore_command calls.
+#
+#restore_command_retry_interval = 5s
+#
 #---
 # RECOVERY TARGET PARAMETERS
 #---
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e54..02c55a8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -235,6 +235,7 @@ static TimestampTz recoveryTargetTime;
 static char *recoveryTargetName;
 static int	recovery_min_apply_delay = 0;
 static TimestampTz recoveryDelayUntilTime;
+static int 	restore_command_retry_interval = 5000L;
 
 /* options taken from recovery.conf for XLOG streaming */
 static bool StandbyModeRequested = false;
@@ -4881,6 +4882,28 @@ readRecoveryCommandFile(void)
 	(errmsg_internal("trigger_file = '%s'",
 	 TriggerFile)));
 		}
+		else if (strcmp(item->name, "restore_command_retry_interval") == 0)
+		{
+			const char *hintmsg;
+
+			if (!parse_int(item->value, &restore_command_retry_interval, GUC_UNIT_MS,
+						   &hintmsg))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("parameter \"%s\" requires a temporal value",
+								"restore_command_retry_interval"),
+						 hintmsg ? errhint("%s", _(hintmsg)) : 0));
+			ereport(DEBUG2,
+					(errmsg_internal("restore_command_retry_interval = '%s'", item->value)));
+
+			if (restore_command_retry_interval <= 0)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("\"%s\" must be bigger than zero",
+								"restore_command_retry_interval")));
+			}
+		}
 		else if (strcmp(item->name, "recovery_min_apply_delay") == 0)
 		{
 			const char *hintmsg;
@@ -10658,13 +10681,14 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 	}
 
 	/*
-	 * Wait for more WAL to arrive. Time out after 5 seconds,
+	 * Wait for more WAL to arrive. Time out after
+	 * restore_command_retry_interval (5 seconds by default),
 	 * like when polling the archive, to react to a trigger
 	 * file promptly.
 	 */
 	WaitLatch(&XLogCtl->recoveryWakeupLatch,
 			  WL_LATCH_SET | WL_TIMEOUT,
-			  5000L);
+			  

Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Bruce Momjian
On Fri, Dec 26, 2014 at 11:52:41PM +0530, Abhijit Menon-Sen wrote:
 At 2014-12-26 13:11:43 -0500, br...@momjian.us wrote:
 
  Is this something that could potentially change the data stored on
  disk? Does pg_upgrade need to check for changes in this area?  Is the
  detection exposed by pg_controldata?  Could this affect running the
  data directory on a different CPU?
 
 No to all.
 
 Subsequent to Heikki's change (already in master) to use the CRC-32C
 algorithm (instead of the earlier mistakenly-reflected-but-not-quite
 one), both the slice-by-8 software implementation posted earlier and
 the SSE4.2 CRC32* instructions will compute exactly the same value.
 
 (Yes, I have verified this in both cases.)

Uh, we didn't fully change all the code to use the new algorithm because
there were cases where the CRC was already stored on disk, e.g. hstore and
ltree.  I assume you are only linking into Heikki's new code and will
not change the places that use the old CRC method on disk --- just
checking.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] What exactly is our CRC algorithm?

2014-12-26 Thread Abhijit Menon-Sen
At 2014-12-26 13:48:40 -0500, br...@momjian.us wrote:

 I assume you are only linking into Heikki's new code and will not
 change the places that use the old CRC method on disk --- just
 checking.

Correct. The legacy CRC computation code used by ltree etc. is
completely unaffected by both my sets of changes.

-- Abhijit




Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Nikita Volkov
I'll repost my (the OP's) case, so that the references to it make more sense
to the others.

Having the following table:

CREATE TABLE song_artist (
  song_id INT8 NOT NULL,
  artist_id INT8 NOT NULL,
  PRIMARY KEY (song_id, artist_id)
);

Even trying to protect against this with a SELECT won't help avoid the
error, because at the beginning of the transaction the key does not
exist yet.

BEGIN ISOLATION LEVEL SERIALIZABLE READ WRITE;
  INSERT INTO song_artist (song_id, artist_id)
    SELECT 1, 2
    WHERE NOT EXISTS
      (SELECT * FROM song_artist WHERE song_id = 1 AND artist_id = 2);
COMMIT;

2014-12-26 21:38 GMT+03:00 Kevin Grittner kgri...@ymail.com:

 Tom Lane t...@sss.pgh.pa.us wrote:

  Just for starters, a 40XXX error report will fail to provide the
  duplicated key's value.  This will be a functional regression,

 Not if, as is normally the case, the transaction is retried from
 the beginning on a serialization failure.  Either the code will
 check for a duplicate (as in the case of the OP on this thread) and
 they won't see the error, *or* the transaction which created
 the duplicate key will have committed before the start of the retry
 and you will get the duplicate key error.

  I think an appropriate response to these complaints is to fix the
  documentation to point out that duplicate-key violations may also
  be worthy of retries.

  but I see no mention of the issue in chapter 13.)

 I agree that's the best we can do for stable branches, and worth
 doing.

 It would be interesting to hear from others who rely on
 serializable transactions in production environments about what
 makes sense to them.  This is probably the wrong list to find such
 people directly; but I seem to recall Josh Berkus has a lot of
 clients who do.  Josh?  Any opinion on this thread?

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Peter Geoghegan
On Fri, Dec 26, 2014 at 7:23 AM, Kevin Grittner kgri...@ymail.com wrote:
 Are there any objections to generating a write conflict instead of
 a duplicate key error if the duplicate key was added by a
 concurrent transaction?  Only for transactions at isolation level
 REPEATABLE READ or higher?

This independently occurred to me as perhaps preferable over a year
ago. The INSERT...ON CONFLICT IGNORE feature that my patch adds will
throw such an error for REPEATABLE READ and higher modes rather than
proceeding on the basis of having ignored something from a concurrent
transaction.
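
For reference, a hedged sketch of how the pending patch's proposed syntax
might look for the OP's table -- this is the syntax from the patch under
discussion, not something any released server accepts, and it may still
change:

INSERT INTO song_artist (song_id, artist_id)
  VALUES (1, 2)
  ON CONFLICT IGNORE;
-- Under READ COMMITTED the conflicting row is simply skipped; under
-- REPEATABLE READ or SERIALIZABLE, a conflict created by a concurrent
-- transaction raises a serialization failure instead, as described above.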

+1 from me for doing this in the master branch, if it isn't too
invasive (I think it might be; where is an EState available to work with,
close to the duplicate checking for B-Trees?). It only makes sense to
do what you propose if the user expects to check that there is no
duplicate ahead of time, and only ever fail with a serialization
failure (not a duplicate violation) from concurrent conflicts. That is
a bit narrow; the OP can only reasonably expect to not see a duplicate
violation on retry because he happens to be checking...most clients
won't, and will waste a retry.

The proposed ON CONFLICT IGNORE feature would do this checking for him
correctly, FWIW, but in typical cases there is no real point in
retrying the xact -- you'll just get another duplicate violation
error.

In any case I think exclusion violations should be covered, too.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?

2014-12-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Yeah, I've been getting more annoyed by that too lately.  I keep wondering
 though whether there's an actual bug underneath that behavior that we're
 failing to see.

 I think the first thing to do is reconsider usage of PGSTAT_RETRY_DELAY
 instead of PGSTAT_STAT_INTERVAL in autovacuum workers.  That decreases
 the wait time 50-fold, if I recall this correctly, and causes large
 amounts of extra I/O traffic.

Yeah --- that means that instead of the normal behavior that a stats file
newer than 500 msec is good enough, an autovac worker insists on a stats
file newer than 10 msec.  I did some experimentation on prairiedog, and
found that it's not hard at all to see autovac workers demanding multiple
stats writes per second:

2014-12-26 16:26:52.958 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:26:53.128 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:26:53.188 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:26:54.903 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:26:55.058 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:00.022 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:00.285 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:00.792 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:01.010 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:01.163 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:01.193 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:03.595 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:03.673 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:03.839 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:03.878 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:05.878 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:06.571 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:07.001 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:07.769 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:07.950 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:10.256 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:11.039 EST 21026 LOG:  sending inquiry for database 45116
2014-12-26 16:27:11.402 EST 21026 LOG:  sending inquiry for database 45116

The argument that autovac workers need fresher stats than anything else
seems pretty dubious to start with.  Why shouldn't we simplify that down
to "they use PGSTAT_STAT_INTERVAL like everybody else"?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?

2014-12-26 Thread Alvaro Herrera
Tom Lane wrote:

 The argument that autovac workers need fresher stats than anything else
 seems pretty dubious to start with.  Why shouldn't we simplify that down
 to "they use PGSTAT_STAT_INTERVAL like everybody else"?

The point of wanting fresher stats than that, eons ago, was to avoid a
worker vacuuming a table that some other worker vacuumed more recently
than PGSTAT_STAT_INTERVAL.  I realize now that the semantics we really
want are something like "stats no older than XYZ", where the given value
is the timestamp at which we start checking; if we get anything newer
than that it would be okay, but we currently reject it because of lack
of a more appropriate API.  (If it takes more than PGSTAT_STAT_INTERVAL
to get the stats back, a regular backend would ask for fresher stats,
but to an autovac worker they would be good enough as long as they are
newer than its recheck start time.)

Nowadays we can probably disregard the whole issue, since starting a new
vacuum just after the prior one finished should not cause much stress to
the system thanks to the visibility map.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-12-26 Thread Peter Geoghegan
On Fri, Dec 19, 2014 at 5:32 PM, Peter Geoghegan p...@heroku.com wrote:
 Most people would list the columns, but if there is a really bizarre
 constraint, with non-default opclasses, or an exclusion constraint, it's
 probably been given a name that you could use.

 What I find curious about the opclass thing is: when do you ever have
 an opclass that has a different idea of equality than the default
 opclass for the type? In other words, when is B-Tree strategy number 3
 not actually '=' in practice, for *any* B-Tree opclass? Certainly, it
 doesn't appear to be the case that it isn't so with any shipped
 opclasses - the shipped non-default B-Tree opclasses only serve to
 provide alternative notions of sort order, and never equals.

 I think that with B-Tree (which is particularly relevant for the
 UPDATE variant), it ought to be defined to work with the type's
 default opclass equals operator, just like GROUP BY and DISTINCT.
 Non-default opclass unique indexes work just as well in practice,
 unless someone somewhere happens to create an oddball one that doesn't
 use '=' as its equals operator (while also having '=' as the default
 opclass equals operator). I am not aware that that leaves any
 actually shipped opclass out (and I include our external extension
 ecosystem here, although I might be wrong about that part).

So looking at the way the system deals with its dependence on default
operator classes, I have a hard time justifying all this extra
overhead for the common case. The optimizer will refuse to use an
index with a non-default opclass even when AFAICT there is no *real*
semantic dependence on anything other than the equals operator,
which seems to always match across a type's opclasses anyway. e.g.,
DISTINCT will only use a default opclass B-Tree index, even though
in practice the equals operator always matches for shipped
non-default opclasses; DISTINCT will not work with a text_pattern_ops
index, while it will work with a default text B-Tree opclass index,
*even though no corresponding ORDER BY was given*.
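
As a hedged sketch of that planner behaviour (the table and index names here
are made up, and the actual plan chosen also depends on costs and
statistics):

CREATE TABLE words (w text);
CREATE INDEX words_w_pattern ON words (w text_pattern_ops);
CREATE INDEX words_w_default ON words (w);

-- A sort-based DISTINCT can be implemented with words_w_default, whose
-- ordering matches the type's default opclass, but not with words_w_pattern,
-- even though both opclasses agree on which text values are equal.
EXPLAIN SELECT DISTINCT w FROM words;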

Someone recently pointed out in a dedicated thread that the system
isn't all that bright about exploiting the fact that group aggregates
don't necessarily need to care about facets of sort ordering like
collations, which have additional overhead [1]. That might be a useful
special case to target (to make underlying sorts faster), but the big
picture is that the system doesn't know when it only needs to care
about an equals operator matching some particular
B-Tree-opclass-defined notion of sorting, rather than caring about a
variety of operators matching. Sometimes, having a matching equals
operator of some non-default opclass is good enough to make an index
(or sort scheme) of that opclass usable for some purpose that only
involves equality, and not sort order (like DISTINCT, with no ORDER
BY, executed using a GroupAggregate, for example).

I thought we should formalize the idea that a non-default opclass must
have the same notion of equality (the same equals operator) as its
corresponding default opclass, if any. That way, presumably the
optimizer has license to be clever about only caring about
DISTINCTness/equality. That also gives my implementation license to
not care about which operator class a unique index uses -- it must not
matter.

Heikki pointed out that there is one shipped opclass that has an
equals operator that happens to not be spelt "=" [2] (and
furthermore, does not match that of the default opclass). That's the
record_image_ops opclass, which unusually has an equals operator of
"*=". So as Heikki pointed out, it looks like there is some limited
precedent for having to worry about B-Tree opclasses that introduce
alternative notions of equals, rather than merely alternative
notions of sort order. So, so much for formalizing that all of a type's
B-Tree opclass equals operators must match...
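
To illustrate the point, a hedged example -- it assumes 9.4's record_image_eq
behaviour with numeric display scale, and the exact output may differ:

SELECT ROW(1.0::numeric) =  ROW(1.00::numeric);   -- true: numeric equality
SELECT ROW(1.0::numeric) *= ROW(1.00::numeric);   -- false: different display
                                                  -- scale, different binary image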

...having thought about it for a while more, though, I think we should
*still* ignore opclass for the purposes of unique index inference. The
implementation doesn't care about the fact that you used a non-default
opclass. Sure, in theory that could lead to inconsistencies, if there
were multiple unique indexes of multiple opclasses that just so
happened to have incompatible ideas about equality, but that seems
ludicrous...we have only one extremely narrow example of how that
could happen. Plus there'd have to be *both* unique indexes defined
and available for us to infer as appropriate, before the inference
logic could accidentally infer the wrong idea of equality. That seems
like an extremely implausible scenario. Even if we allow for the idea
that alternative notions of equality are something that will happen in
the wild, obviously the user cares about the definition of equality
that they actually used for the unique index in question.

We can document that unique index inference doesn't care about
opclasses (recall that I still only plan on letting users infer a
B-Tree unique index), 

Re: [HACKERS] BUG #12330: ACID is broken for unique constraints

2014-12-26 Thread Merlin Moncure
On Fri, Dec 26, 2014 at 12:38 PM, Kevin Grittner kgri...@ymail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 Just for starters, a 40XXX error report will fail to provide the
 duplicated key's value.  This will be a functional regression,

 Not if, as is normally the case, the transaction is retried from
 the beginning on a serialization failure.  Either the code will
 check for a duplicate (as in the case of the OP on this thread) and
 they won't see the error, *or* the transaction which created
 the duplicate key will have committed before the start of the retry
 and you will get the duplicate key error.

I'm not buying that; that argument assumes duplicate key errors are
always 'upsert' driven.  Although the OP's code may have checked for
duplicates, it's perfectly reasonable (and in many cases preferable) to
force the transaction to fail and report the error directly back to
the application.  The application will then switch on the error code
and decide what to do: retry for deadlock/serialization or abort for
data integrity error.  IOW, the error handling semantics are
fundamentally different and should not be mixed.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers