Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Benedikt Grundmann
I posted this on this mailing list before: at Jane Street we have developed
very fast code to get timing information based on the TSC, if available.  It's
all OCaml, but it is well documented and mostly just calls to C functions, so
it should be easy to port to C, and we release it under a very liberal license,
so it should be no problem to take the ideas:

https://github.com/janestreet/core/blob/master/lib/time_stamp_counter.mli

Hope this is useful.

Bene


On Wed, May 14, 2014 at 12:41 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, May 13, 2014 at 11:34 PM, Greg Stark st...@mit.edu wrote:
  I always assumed the kernel used rdtsc to implement some of the high
  performance timers. It can save the current time in a mapped page when
  it schedules a process and then in the vdso syscall (ie in user-space)
  it can use rdtsc to calculate the offset needed to adjust that
  timestamp to the current time. This seems consistent with your
  calculations that showed the 40ns overhead with +/- 10ns precision.

 Crazy idea: Instead of trying to time precisely the amount of time we
 spend in each node, configure a very-high frequency timer interrupt
 (or background thread?) that does:

 SomeGlobalVariablePointingToTheCurrentNode->profiling_counter++;

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Benedikt Grundmann
On Thu, May 15, 2014 at 8:19 AM, Benedikt Grundmann 
bgrundm...@janestreet.com wrote:

 I posted this on this mailing list before at Jane Street we have developed
 very fast code to get timing information based on TSC if available.  It's
 all ocaml but well documented and mostly just calls to c functions so
 should be easy to port to C and we release it under a very liberal license
 so it should be no problem to take the ideas:

 https://github.com/janestreet/core/blob/master/lib/time_stamp_counter.mli

 Hope this is useful.

 Bene


Also, I'm sorry for top posting.  hackers is the only mailing list I'm on
that requires this, and some others require top posting, so it runs counter
to my habits and I only realized after sending...




 On Wed, May 14, 2014 at 12:41 PM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, May 13, 2014 at 11:34 PM, Greg Stark st...@mit.edu wrote:
  I always assumed the kernel used rdtsc to implement some of the high
  performance timers. It can save the current time in a mapped page when
  it schedules a process and then in the vdso syscall (ie in user-space)
  it can use rdtsc to calculate the offset needed to adjust that
  timestamp to the current time. This seems consistent with your
  calculations that showed the 40ns overhead with +/- 10ns precision.

 Crazy idea: Instead of trying to time precisely the amount of time we
 spend in each node, configure a very-high frequency timer interrupt
 (or background thread?) that does:

 SomeGlobalVariablePointingToTheCurrentNode->profiling_counter++;

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers





[HACKERS] Selectivity estimation for inet operators

2014-05-15 Thread Emre Hasegeli
New version of the selectivity estimation patch attached. I am adding
it to CommitFest 2014-06. The previous version was reviewed by
Andreas Karlson on the previous CommitFest, together with the GiST support
patch. The new version includes join selectivity estimation.

Join selectivity is calculated in 4 steps:

* matching first MCV to second MCV
* searching first MCV in the second histogram
* searching second MCV in the first histogram
* searching boundaries of the first histogram in the second histogram

Comparing the lists with each other slows down the function when the
statistics target is set to higher values. To avoid this problem I only use
log(n) values of the lists: the first log(n) values for the MCV list, and
evenly spaced values for the histograms. In my tests, this optimization
does not affect the planning time when statistics = 100, but it does
affect the accuracy of the estimation. I can send the version without
this optimization, if the slowdown with larger statistics targets is not a
problem that should be solved in the selectivity estimation function.
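
For readers who want the shape of that subsampling without reading the patch,
here is a minimal sketch in C (names are illustrative and not from the
patch): pick roughly log2(n) evenly spaced entries of an n-entry histogram.

#include <math.h>

/*
 * Illustrative sketch only: choose about log2(nvalues) roughly evenly
 * spaced indexes into a histogram with nvalues entries, writing them into
 * sample[] and returning how many were chosen.
 */
static int
sample_histogram_indexes(int nvalues, int *sample)
{
    int     nsample;
    int     i;

    if (nvalues <= 0)
        return 0;

    nsample = (int) ceil(log2((double) nvalues));
    if (nsample < 1)
        nsample = 1;
    if (nsample > nvalues)
        nsample = nvalues;

    if (nsample == 1)
    {
        sample[0] = 0;
        return 1;
    }

    for (i = 0; i < nsample; i++)
        sample[i] = (int) rint((double) i * (nvalues - 1) / (nsample - 1));

    return nsample;
}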

I also attach the script I was using for testing and I left log statements
in the networkjoinsel() function to make testing easier. These statements
should be removed before commit.


inet-selfuncs-v4.patch
Description: Binary data


inet-selfuncs-test.sql
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error in running DBT2

2014-05-15 Thread Rohit Goyal
Hi All,

I am running dbt2 with the latest PostgreSQL kit, pg9.4. I tried everything
again after setting up a whole new machine with Ubuntu. Still facing the
same error.


I ran the *dbt2-pgsql-build-db -w 1*

but, after some time, I faced this error

*/home/abhi/dbt2_install/bin/dbt2-pgsql-load-stored-procs: 45: [: c:
unexpected operator*
*/home/abhi/dbt2_install/bin/dbt2-pgsql-load-stored-procs: 53: [: c:
unexpected operator*
*unknown stored function type: c*

and the script ends!!

Please guide me to a probable solution to get over it.

Regards,
Rohit
On Wed, May 14, 2014 at 12:43 AM, Rohit Goyal rhtgyl...@gmail.com wrote:

 Hi Peter,

 I tried the solution suggested by you, but the problem still persists.

 I ran the *dbt2-pgsql-build-db -w 1*

 but, after some time, I faced this error

 */home/abhi/dbt2_install/bin/dbt2-pgsql-load-stored-procs: 45: [: c:
 unexpected operator*
 */home/abhi/dbt2_install/bin/dbt2-pgsql-load-stored-procs: 53: [: c:
 unexpected operator*
 *unknown stored function type: c*

 and the script ends!!

 Moreover, I see that the DB has been created and also 9 tables are there in
 database dbt2. Please suggest how to proceed.

 Regards,
 rohit Goyal




 On Tue, May 13, 2014 at 9:44 PM, Peter Geoghegan p...@heroku.com wrote:


 On Tue, May 13, 2014 at 12:36 PM, Rohit Goyal rhtgyl...@gmail.com wrote:

 This pattern the above found many times. Please guide me through!!!


 IIRC, people have been working around this by setting
 standard_conforming_strings to off. It really ought to be fixed in a
 principled way, though -- the real issue here is that dbt2 has severe
 bit-rot.

 --
 Peter Geoghegan




 --
 Regards,
 Rohit Goyal




-- 
Regards,
Rohit Goyal


[HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build

2014-05-15 Thread Heikki Linnakangas

On 05/14/2014 06:06 PM, Noah Misch wrote:

On Wed, May 14, 2014 at 05:51:24PM +0300, Heikki Linnakangas wrote:

On 05/14/2014 05:37 PM, Noah Misch wrote:

On Wed, May 14, 2014 at 03:15:38PM +0300, Heikki Linnakangas wrote:

On 05/09/2014 02:56 AM, Noah Misch wrote:

MinGW: 
http://sourceforge.net/p/mingw/mingw-org-wsl/ci/master/tree/include/stdio.h#l467
MinGW-w64: 
http://sourceforge.net/p/mingw-w64/code/HEAD/tree/trunk/mingw-w64-headers/crt/stdio.h#l496

Building with any recent MinGW-w64, 32-bit or 64-bit, gets the reported
warnings; building with MinGW proper does not.


Hmm. The MinGW-w64 header does this:


#if !defined(NO_OLDNAMES) && !defined(popen)
#define popen _popen
#define pclose _pclose
#endif


So if we defined popen() before including stdio.h, that would get
rid of the warning. But we don't usually do things in that order.


True.  I have no strong preference between that and use of #undef.


I think I would prefer #undef. The risk with that is if some
platform has #defined popen() to something else entirely, for some
good reason, we would be bypassing that hypothetical wrapper. But I
guess we'll cross that bridge if we get there.
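
For concreteness, the #undef approach amounts to something like the
following, placed after the system headers (a sketch only, not necessarily
the committed change):

#include <stdio.h>

/*
 * MinGW-w64's stdio.h may define popen/pclose as macros for _popen/_pclose.
 * Undefine them so that our own declarations and uses don't trigger
 * redefinition warnings.
 */
#ifdef popen
#undef popen
#endif
#ifdef pclose
#undef pclose
#endif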


Works for me.  Since (popen)(x, y) shall behave the same as popen(x, y),
such a hypothetical system header would be buggy, anyway.


Ok, I committed #undefs. I don't have a Mingw(-w64) environment to test 
with, so let's see if the buildfarm likes it.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Greg Stark
On Thu, May 15, 2014 at 8:19 AM, Benedikt Grundmann
bgrundm...@janestreet.com wrote:
 I posted this on this mailing list before at Jane Street we have developed
 very fast code to get timing information based on TSC if available.  It's
 all ocaml but well documented and mostly just calls to c functions so should
 be easy to port to C and we release it under a very liberal license so it
 should be no problem to take the ideas:

What OS do you run it on though? How fast is your implementation
compared to the kernel implementation of clock_gettime()?

Are you sure your implementation is actually faster? And are you sure
you're protected against clocks going backwards? I think you should
put some i/o in the loop in the test and start several threads running
it to make it more likely the thread is rescheduled to a different
processor during the test. I suspect you'll find the rdtsc goes
backwards sometimes or produces crazy results when switching
processors.
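
A rough sketch of that kind of test (illustrative only; x86 plus GCC/Clang is
assumed for __rdtsc(), and the thread and iteration counts are arbitrary):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define NTHREADS 8
#define NITER    10000000L

/* Each thread reads the TSC in a loop and counts readings that go backwards
 * relative to the previous reading seen by the same thread.  To make
 * rescheduling across CPUs more likely, one could also add some I/O or a
 * sched_yield() inside the loop. */
static void *
worker(void *arg)
{
    unsigned long long prev = __rdtsc();
    long backwards = 0;
    long i;

    for (i = 0; i < NITER; i++)
    {
        unsigned long long now = __rdtsc();

        if (now < prev)
            backwards++;
        prev = now;
    }
    printf("thread %ld: %ld backwards readings\n",
           (long) (intptr_t) arg, backwards);
    return NULL;
}

int
main(void)
{
    pthread_t th[NTHREADS];
    long i;

    for (i = 0; i < NTHREADS; i++)
        pthread_create(&th[i], NULL, worker, (void *) (intptr_t) i);
    for (i = 0; i < NTHREADS; i++)
        pthread_join(th[i], NULL);
    return 0;
}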

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Benedikt Grundmann
On Thu, May 15, 2014 at 11:31 AM, Greg Stark st...@mit.edu wrote:

 On Thu, May 15, 2014 at 8:19 AM, Benedikt Grundmann
 bgrundm...@janestreet.com wrote:
  I posted this on this mailing list before at Jane Street we have
 developed
  very fast code to get timing information based on TSC if available.  It's
  all ocaml but well documented and mostly just calls to c functions so
 should
  be easy to port to C and we release it under a very liberal license so it
  should be no problem to take the ideas:

 What OS do you run it on though? How fast is your implementation
 compared to the kernel implementation of clock_gettime()?

 Are you sure your implementation is actually faster? And are you sure
 you're protected against clocks going backwards? I think you should
 put some i/o in the loop in the test and start several threads running
 it to make it more likely the thread is rescheduled to a different
 processor during the test. I suspect you'll find the rdtsc goes
 backwards sometimes or produces crazy results when switching
 processors.


There are benchmarks in the link I posted (obtained by a micro-benchmarking
library we developed and use internally, which takes great care to obtain
reliable numbers).  We use POSIX threads extensively.  We internally spend
a lot of time setting up NTP and monitoring systems so that the clock going
backwards never happens (in other words, I wouldn't be surprised if the
library does NOT work correctly when it does -- our protection is outside).
I do not believe we have seen the rdtsc going backwards on the thread
context switches you mention (and, as said, we use lots of threads).  OS?
CentOS 6.5 primarily.







--
 greg



Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Andres Freund
On 2014-05-15 12:04:25 +0100, Benedikt Grundmann wrote:
 On Thu, May 15, 2014 at 11:31 AM, Greg Stark st...@mit.edu wrote:
 
  On Thu, May 15, 2014 at 8:19 AM, Benedikt Grundmann
  bgrundm...@janestreet.com wrote:
   I posted this on this mailing list before at Jane Street we have
  developed
   very fast code to get timing information based on TSC if available.  It's
   all ocaml but well documented and mostly just calls to c functions so
  should
   be easy to port to C and we release it under a very liberal license so it
   should be no problem to take the ideas:
 
  What OS do you run it on though? How fast is your implementation
  compared to the kernel implementation of clock_gettime()?
 
  Are you sure your implementation is actually faster? And are you sure
  you're protected against clocks going backwards? I think you should
  put some i/o in the loop in the test and start several threads running
  it to make it more likely the thread is rescheduled to a different
  processor during the test. I suspect you'll find the rdtsc goes
  backwards sometimes or produces crazy results when switching
  processors.
 
 
 There are benchmarks in the link I posted (obtained by a micro benchmarking
 library we developed / use internally which takes great care to obtain
 reliable numbers) .  We use posix threads extensively. We internally spend
 a lot of time setting up ntp and monitoring systems so that clock backwards
 never happens (so in other words I wouldn't be surprised if the library
 does NOT work correctly when it does  -- our protection is outside).  I do
 not believe we have seen the rdtsc going backwards on thread context switch
 you mention (and as said we use lots of threads).  OS?  Centos 6.5
 primarily.

Did you test it on a server with more than one socket (i.e. not just multiple
cores, but distinct CPU packages)? That's where you would expect differences
in the TSC to have funny effects.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Andres Freund
Hi,

On 2014-05-13 18:58:11 -0400, Tom Lane wrote:
 Anyway it looks like clock_gettime() might be worth using on Linux
 just for the more precise output.  It doesn't seem to exist on OS X
 though, and I have no idea about elsewhere.

Agreed that using clock_gettime() would be a good idea. I'd say we
should have a wrapper around it that is able to provide nanosecond
precision. If only gettimeofday() (and whatever windows is using) is
available, we can dynamically fall back to that.
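
A sketch of what such a wrapper could look like (illustrative only; the
HAVE_CLOCK_GETTIME symbol is a stand-in for whatever configure test we would
use, and a truly dynamic fallback would need a runtime probe rather than an
#ifdef):

#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Return a wall-clock timestamp in nanoseconds since the Unix epoch. */
static uint64_t
pg_clock_ns(void)
{
#ifdef HAVE_CLOCK_GETTIME
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    return (uint64_t) ts.tv_sec * 1000000000 + (uint64_t) ts.tv_nsec;
#else
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (uint64_t) tv.tv_sec * 1000000000 + (uint64_t) tv.tv_usec * 1000;
#endif
}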

 I'm curious if anybody has ideas about other things we might do for
 portable high-precision timing.

It's far from a solve-it-all, but can we perhaps try to coalesce
repeated time measurements? We'll very frequently do a
InstrStopNode();
/* minimal amount of work */
InstrStartNode();
which will measure the time twice. I think there's a fair number of
scenarios where once would be enough. I'll freely admit that I haven't
looked enough to determine how we could do that API wise.
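
Purely as an illustration of the idea (not a proposal for the actual
instrument.c API, which keeps more state than this), the coalesced form
would read the clock once and use the same timestamp to close one interval
and open the next:

#include <time.h>

typedef struct SimpleInstr
{
    struct timespec starttime;  /* when the current interval began */
    double          total;      /* accumulated seconds */
} SimpleInstr;

/* One clock read serves as both the stop time of the finishing node and
 * the start time of the next one. */
static void
instr_stop_start(SimpleInstr *stopping, SimpleInstr *starting)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    stopping->total += (double) (now.tv_sec - stopping->starttime.tv_sec)
        + (double) (now.tv_nsec - stopping->starttime.tv_nsec) / 1e9;
    starting->starttime = now;
}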

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Various cosmetic fixes

2014-05-15 Thread Heikki Linnakangas

On 05/14/2014 08:49 PM, Euler Taveira wrote:

While updating the pt-br translation I noticed that some sentences could be
improved. I also fixed some style glitches. A set of patches is attached.


Thanks, applied.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Bruce Momjian
On Tue, May  6, 2014 at 11:15:17PM +0100, Simon Riggs wrote:
  Well, for what it's worth, I've encountered systems where setting
  effective_cache_size too low resulted in bad query plans, but I've
  never encountered the reverse situation.
 
 I agree with that.
 
 Though that misses my point, which is that you can't know that all of
 that memory is truly available on a server with many concurrent users.
 Choosing settings that undercost memory-intensive plans is not the
 best choice for a default strategy in a mixed workload when cache may
 be better used elsewhere, even if such settings make sense for some
 individual users.

This is the same problem we had with auto-tuning work_mem, in that we
didn't know what other concurrent activity was happening.  Seems we need
concurrent activity detection before auto-tuning work_mem and
effective_cache_size.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] popen and pclose redefinitions causing many warning in Windows build

2014-05-15 Thread Michael Paquier
On Thu, May 15, 2014 at 6:20 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Ok, I committed #undefs. I don't have a Mingw(-w64) environment to test
 with, so let's see if the buildfarm likes it.
There does not seem to be a buildfarm machine using MinGW-w64... Btw,
I tested latest master on a Windows box and MinGW-w64 is happier now.
Thanks!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Amit Langote
On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian br...@momjian.us wrote:

 This is the same problem we had with auto-tuning work_mem, in that we
 didn't know what other concurrent activity was happening.  Seems we need
 concurrent activity detection before auto-tuning work_mem and
 effective_cache_size.


Perhaps I am missing something obvious here, but would mmgr have any
useful numbers on this? Like any book-keeping info maintained by
mcxt.c/aset.c? Would extending that interface help?

--
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build

2014-05-15 Thread Heikki Linnakangas

On 05/15/2014 04:15 PM, Michael Paquier wrote:

On Thu, May 15, 2014 at 6:20 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

Ok, I committed #undefs. I don't have a Mingw(-w64) environment to test
with, so let's see if the buildfarm likes it.

There does not seem to be a buildfarm machine using MinGW-w64...


Jacana. It has gcc 4.8.1 listed as the compiler, but if you look at 
the config in detail, it's mingw-w64. The popen/pclose warnings are 
there. It hasn't performed a build after I committed the fix yet.



Btw, I tested latest master on a Windows box and MinGW-w64 is happier now.


Thanks!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: [HACKERS] Problem with txid_snapshot_in/out() functionality)

2014-05-15 Thread Heikki Linnakangas

On 05/06/2014 02:44 PM, Andres Freund wrote:

On 2014-05-05 13:41:00 +0300, Heikki Linnakangas wrote:

+/*
+ * Exit hook to unlock the global transaction entry we're working on.
+ */
+static void
+AtProcExit_Twophase(int code, Datum arg)
+{
+   /* same logic as abort */
+   AtAbort_Twophase();
+}
+
+/*
+ * Abort hook to unlock the global transaction entry we're working on.
+ */
+void
+AtAbort_Twophase(void)
+{
+   if (MyLockedGxact == NULL)
+   return;
+
+   /*
+* If we were in process of preparing the transaction, but haven't
+* written the WAL record yet, remove the global transaction entry.
+* Same if we are in the process of finishing an already-prepared
+* transaction, and fail after having already written the WAL 2nd
+* phase commit or rollback record.
+*
+* After that it's too late to abort, so just unlock the 
GlobalTransaction
+* entry.  We might not have transferred all locks and other state to the
+* prepared transaction yet, so this is a bit bogus, but it's the best 
we
+* can do.
+*/
+   if (!MyLockedGxact->valid)
+   {
+   RemoveGXact(MyLockedGxact);
+   }
+   else
+   {
+   LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+
+   MyLockedGxact->locking_backend = InvalidBackendId;
+
+   LWLockRelease(TwoPhaseStateLock);
+   }
+   MyLockedGxact = NULL;
+}


Is it guaranteed that all paths have called LWLockReleaseAll()
before calling the proc exit hooks? Otherwise we might end up waiting
for ourselves...


Hmm. AbortTransaction() will release locks before we get here, but the 
before_shmem_exit() callpath will not. So an elog(FATAL), while holding 
TwoPhaseStateLock would cause us to deadlock with ourself. But there are 
no such elogs.


I copied this design from async.c, which is quite similar, so if there's 
a problem that ought to be fixed too. And there are other more 
complicated before_shmem callbacks that worry me more, like 
createdb_failure_callback(). But I think they're all all right.



  /*
   * MarkAsPreparing
@@ -261,29 +329,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
 errmsg("prepared transactions are disabled"),
  errhint("Set max_prepared_transactions to a nonzero value.")));

-   LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
-   /*
-* First, find and recycle any gxacts that failed during prepare. We do
-* this partly to ensure we don't mistakenly say their GIDs are still
-* reserved, and partly so we don't fail on out-of-slots unnecessarily.
-*/
-   for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
+   /* on first call, register the exit hook */
+   if (!twophaseExitRegistered)
{
-   gxact = TwoPhaseState->prepXacts[i];
-   if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid))
-   {
-   /* It's dead Jim ... remove from the active array */
-   TwoPhaseState->numPrepXacts--;
-   TwoPhaseState->prepXacts[i] =
TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
-   /* and put it back in the freelist */
-   gxact->next = TwoPhaseState->freeGXacts;
-   TwoPhaseState->freeGXacts = gxact;
-   /* Back up index count too, so we don't miss scanning
one */
-   i--;
-   }
+   before_shmem_exit(AtProcExit_Twophase, 0);
+   twophaseExitRegistered = true;
}


It's not particularly nice to register shmem exit hooks in the middle of
normal processing because it makes it impossible to use
cancel_before_shmem_exit() previously registered hooks. I think this
should be registered at startup, if max_prepared_xacts > 0.


shrug. async.c and namespace.c do the same, and it hasn't been a
problem.


I committed this now, but please let me know if you see a concrete 
problem with the locks.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Bruce Momjian
On Thu, May 15, 2014 at 10:23:19PM +0900, Amit Langote wrote:
 On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian br...@momjian.us wrote:
 
  This is the same problem we had with auto-tuning work_mem, in that we
  didn't know what other concurrent activity was happening.  Seems we need
  concurrent activity detection before auto-tuning work_mem and
  effective_cache_size.
 
 
 Perhaps I am missing something obvious here, but would mmgr have any
 useful numbers on this? Like any book-keeping info maintained by
 mcxt.c/aset.c? Would extending that interface help?

No, all memory allocation is per-process, except for shared memory.  We
probably need a way to record our large local memory allocations in
PGPROC that other backends can see;  same for effective cache size
assumptions we make.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Amit Langote
On Thu, May 15, 2014 at 11:24 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, May 15, 2014 at 10:23:19PM +0900, Amit Langote wrote:
 On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian br...@momjian.us wrote:
 
  This is the same problem we had with auto-tuning work_mem, in that we
  didn't know what other concurrent activity was happening.  Seems we need
  concurrent activity detection before auto-tuning work_mem and
  effective_cache_size.
 

 Perhaps I am missing something obvious here, but would mmgr have any
 useful numbers on this? Like any book-keeping info maintained by
 mcxt.c/aset.c? Would extending that interface help?

 No, all memory allocation is per-process, except for shared memory.  We
 probably need a way to record our large local memory allocations in
 PGPROC that other backends can see;  same for effective cache size
 assumptions we make.


I see. I thought there would be some centralised way to traverse, say,
a linked list of contexts that individual backends create or something
like that. But, I suppose it would not be straightforward to make any
of that work for what we are after here.

--
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: [HACKERS] Problem with txid_snapshot_in/out() functionality)

2014-05-15 Thread Andres Freund
On 2014-05-15 17:21:28 +0300, Heikki Linnakangas wrote:
 Is it guaranteed that all paths have called LWLockReleaseAll()
 before calling the proc exit hooks? Otherwise we might end up waiting
 for ourselves...
 
 Hmm. AbortTransaction() will release locks before we get here, but the
 before_shmem_exit() callpath will not. So an elog(FATAL), while holding
 TwoPhaseStateLock would cause us to deadlock with ourself. But there are no
 such elogs.

 I copied this design from async.c, which is quite similar, so if there's a
 problem that ought to be fixed too. And there are other more complicated
 before_shmem callbacks that worry me more, like createdb_failure_callback().
 But I think they're all all right.

Perhaps we should enforce that LWLockReleaseAll() is called first?
E.g. in shmem_exit()? It'll happen in ProcKill() atm, but that's
normally pretty much at the bottom of the stack.

 It's not particularly nice to register shmem exit hooks in the middle of
 normal processing because it makes it impossible to use
 cancel_before_shmem_exit() previously registered hooks. I think this
 should be registered at startup, if max_prepared_xacts > 0.
 
 shrug. async.c and namespace.c does the same, and it hasn't been a
 problem.

Well, it doesn't seem unreasonable to have C code using
PG_ENSURE_ERROR_CLEANUP/PG_END_ENSURE_ERROR_CLEANUP around a 2pc commit
to me. That'll break with this.
Perhaps we should just finally make cancel_before_shmem_exit search the
stack of callbacks.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Unportability of setvbuf()

2014-05-15 Thread Tom Lane
From the department of punishing good deeds ... in commit 2dc4f011fd
I added setvbuf() calls to initdb to ensure that output to stdout and
stderr would appear in a consistent order regardless of whether the
output was going to a terminal or a file.  The buildfarm shows that
on several (but not all) Windows machines, initdb is now failing
without printing anything.  After a bit of research, I believe what
is happening is:

(1) On Windows, setvbuf interprets _IOLBF as _IOFBF.
(2) If _IOFBF is specified and size is zero, it reports EINVAL.
(3) If parameter validation is enabled, EINVAL turns into abort().

Thanks Microsoft for your careful attention to compliance with POSIX.

I see that syslogger.c encountered this problem long ago and solved
it by the expedient of using _IONBF not _IOLBF for the log output
file.  That's probably what we must do for stdout in initdb as well.
It seems likely that we may need the same in other client programs
someday.

What I'm not totally sure about is how to wrap this up nicely.
I'm inclined first of all to move syslogger.c's LBF_MODE symbol to
port.h and rename it to, say, PG_IOLBF.

It might also be reasonable to create a wrapper macro along the line of
PG_STD_IO_BUFFERING() that would encapsulate the whole sequence
setvbuf(stdout, NULL, _IOLBF, 0);
setvbuf(stderr, NULL, _IONBF, 0);
Or maybe we should have separate macros for those two calls.  Or maybe
this is just a useless layer of abstraction and PG_IOLBF is enough
to make the calls portable.
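
One possible shape for such macros, reusing syslogger.c's trick of
substituting no buffering where Windows rejects _IOLBF (a sketch only, not a
worked-out proposal):

#include <stdio.h>

#ifdef WIN32
/* Windows' setvbuf treats _IOLBF as _IOFBF and rejects a zero size, so fall
 * back to unbuffered output there. */
#define PG_IOLBF    _IONBF
#else
#define PG_IOLBF    _IOLBF
#endif

#define PG_STD_IO_BUFFERING() \
    do { \
        setvbuf(stdout, NULL, PG_IOLBF, 0); \
        setvbuf(stderr, NULL, _IONBF, 0); \
    } while (0)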

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Problem with txid_snapshot_in/out() functionality

2014-05-15 Thread Heikki Linnakangas

On 04/14/2014 11:55 AM, Marko Kreen wrote:

On Sun, Apr 13, 2014 at 05:46:20PM -0400, Jan Wieck wrote:

On 04/13/14 14:22, Jan Wieck wrote:

On 04/13/14 08:27, Marko Kreen wrote:

I think you need to do SET_VARSIZE also here.  Alternative is to
move SET_VARSIZE after sort_snapshot().

And it seems the drop-double-txid logic should be added also to
txid_snapshot_recv().  It seems weird to have it behave differently
from txid_snapshot_in().




Thanks,

yes on both issues. Will create another patch.


New patch attached.

New github commit is 
https://github.com/wieck/postgres/commit/b8fd0d2eb78791e5171e34aecd233fd06218890d


Looks OK to me.


Ok, committed.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unportability of setvbuf()

2014-05-15 Thread Alvaro Herrera
Tom Lane wrote:

 It might also be reasonable to create a wrapper macro along the line of
 PG_STD_IO_BUFFERING() that would encapsulate the whole sequence
   setvbuf(stdout, NULL, _IOLBF, 0);
   setvbuf(stderr, NULL, _IONBF, 0);
 Or maybe we should have separate macros for those two calls.  Or maybe
 this is just a useless layer of abstraction and PG_IOLBF is enough
 to make the calls portable.
 
 Thoughts?

I don't really know all that much about this stuff, but see commits
6eda3e9c27781dec369542a9b20cba7c3d832a5e and its parent about
isolationtester.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Bruce Momjian
On Thu, May 15, 2014 at 11:36:51PM +0900, Amit Langote wrote:
  No, all memory allocation is per-process, except for shared memory.  We
  probably need a way to record our large local memory allocations in
  PGPROC that other backends can see;  same for effective cache size
  assumptions we make.
 
 
 I see. I thought there would be some centralised way to traverse, say,
 a linked list of contexts that individual backends create or something
 like that. But, I suppose it would not be straightforward to make any
 of that work for what we are after here.

The problem is locking overhead between sessions.  Right now we avoid
all of that, and I think if we just put the value in PGPROC, it will be
good enough with limited locking required.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unportability of setvbuf()

2014-05-15 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 It might also be reasonable to create a wrapper macro along the line of
 PG_STD_IO_BUFFERING() that would encapsulate the whole sequence
 setvbuf(stdout, NULL, _IOLBF, 0);
 setvbuf(stderr, NULL, _IONBF, 0);
 Or maybe we should have separate macros for those two calls.  Or maybe
 this is just a useless layer of abstraction and PG_IOLBF is enough
 to make the calls portable.
 
 Thoughts?

 I don't really know all that much about this stuff, but see commits
 6eda3e9c27781dec369542a9b20cba7c3d832a5e and its parent about
 isolationtester.

Yeah, making them both unbuffered is another scenario that has its
use-cases, so maybe it's inappropriate to create a macro that presumes
to define the One True Way.

For the moment I'll just arrange for initdb to share the logic with
syslogger.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Tomas Vondra
Hi all,

today I got a few errors like these (this one is from last week, though):

   Status Line: 493 snapshot too old: Wed May  7 04:36:57 2014 GMT
   Content:
   snapshot to old: Wed May  7 04:36:57 2014 GMT

on the new buildfarm animals. I believe it was my mistake (incorrectly
configured local git mirror), but it got me thinking about how this will
behave with the animals running CLOBBER_CACHE_RECURSIVELY.

If I understand the Perl code correctly, it does this:

(1) update the repository
(2) run the tests
(3) check that the snapshot is not older than 24 hours (pgstatus.pl:188)
(4) fail if older

Now, imagine that the test runs for days/weeks. This pretty much means
it's wasted, because the results will be thrown away anyway, no?

regards
Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Logical replication woes

2014-05-15 Thread Heikki Linnakangas

Spotted while testing pg_recvlogical:

1. Set up pg_recvlogical to receive:

./pg_recvlogical -S fooslot -d postgres --create
./pg_recvlogical -S fooslot -d postgres --start -f -

2. In another terminal, with psql:

create table foo (id int4);
begin;
 insert into foo values (4);
 alter table foo alter column id type text;
prepare transaction 'foo';
commit prepared 'foo';
insert into foo values (1);

3.  With current HEAD, after commit 
bb38fb0d43c8d7ff54072bfd8bd63154e536b384, this produces an assertion 
failure:


TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File: 
"reorderbuffer.c", Line: 508)


I believe that's because we no longer assign another XID to the transaction that 
does the COMMIT PREPARED. Previously, an extra XID, in addition to the 
XID of the prepared transaction, was assigned for use in locking the 
global transaction entry in shared memory, but that's no longer required.


However, even with that patch reverted, it doesn't work correctly:

ERROR:  could not map filenode base/12142/16390 to relation OID
LOG:  starting logical decoding for slot fooslot
DETAIL:  streaming transactions committing after 0/16D1670, reading WAL 
from 0/16BC470


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_recvlogical, stdout and SIGHUP

2014-05-15 Thread Andres Freund
On 2014-05-13 17:43:47 +0300, Heikki Linnakangas wrote:
 On 05/13/2014 04:35 PM, Andres Freund wrote:
 On 2014-05-13 16:31:25 +0300, Heikki Linnakangas wrote:
 Another thing I noticed is that when the output goes to a file, the file
 isn't re-opened immediately on SIGHUP. Only after receiving some data from
 the server. I believe that's also not intentional.
 
 Hm. I can't really get excited about that one. Not doing that seems to
 complicate matters unnecessarily. What's the problem here?
 
 Not sure if it matters in any real-world scenario, but I found it pretty
 surprising while playing with it. It should be trivial to fix; ISTM the
 problem is that there is a continue in the loop when select() is
 interrupted by signal, but the re-opening is done after the select() in the
 loop. I think all you need to do is move the check for output_reopen to the
 beginning of the loop.

Thanks for fixing and sorry for being slow :(.

Any reason you didn't also move the opening of the output file up? It
seems a bit odd to not have an output file existing every now and
then...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Heikki Linnakangas

On 05/15/2014 07:57 PM, Heikki Linnakangas wrote:

Spotted while testing pg_recvlogical:

1. Set up pg_recvlogical to receive:

./pg_recvlogical -S fooslot -d postgres --create
./pg_recvlogical -S fooslot -d postgres --start -f -

2. In another terminal, with psql:

create table foo (id int4);
begin;
   insert into foo values (4);
   alter table foo alter column id type text;
prepare transaction 'foo';
commit prepared 'foo';
insert into foo values (1);

3.  With current HEAD, after commit
bb38fb0d43c8d7ff54072bfd8bd63154e536b384, this produces an assertion
failure:

TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File:
"reorderbuffer.c", Line: 508)

I believe that's because we no longer assign another XID to the transaction that
does the COMMIT PREPARED. Previously, an extra XID, in addition to the
XID of the prepared transaction, was assigned for use in locking the
global transaction entry in shared memory, but that's no longer required.

However, even with that patch reverted, it doesn't work correctly:

ERROR:  could not map filenode base/12142/16390 to relation OID
LOG:  starting logical decoding for slot fooslot
DETAIL:  streaming transactions committing after 0/16D1670, reading WAL
from 0/16BC470


Ok, so the immediate cause was quick to find: when decoding a 
commit-prepared WAL record, we have to use the XID from the record 
content (patch attached). The XID in the record header is the XID of the 
transaction doing the COMMIT PREPARED, which is always 0 after patch 
bb38fb0d43c8d7ff54072bfd8bd63154e536b384 which causes the assertion. But 
it was always wrong. After fixing, it no longer asserts or gives the
above "could not map filenode" error.


However, it still doesn't seem right. When I do the above as a regular 
transaction, ie:


begin; insert into foo values (6);  alter table foo alter column id type 
text; commit;


pg_recvlogical prints this:

BEGIN 708
table public.foo: INSERT: id[text]:'6'
COMMIT 708

But if I do it as a prepared transaction:

begin; insert into foo values (7);  alter table foo alter column id type 
text; prepare transaction 'foo'; commit prepared 'foo';


pg_recvlogical prints nothing.

- Heikki
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d6499d5..cc73652 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -225,7 +225,7 @@ DecodeXactOp(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 subxacts = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
 invals = (SharedInvalidationMessage *) &(subxacts[xlrec->nsubxacts]);
 
-DecodeCommit(ctx, buf, prec->xl_xid, xlrec->dbId,
+DecodeCommit(ctx, buf, prec->xid, xlrec->dbId,
 			 xlrec->xact_time,
 			 xlrec->nsubxacts, subxacts,
 			 xlrec->nmsgs, invals);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_recvlogical, stdout and SIGHUP

2014-05-15 Thread Heikki Linnakangas

On 05/15/2014 07:59 PM, Andres Freund wrote:

On 2014-05-13 17:43:47 +0300, Heikki Linnakangas wrote:

On 05/13/2014 04:35 PM, Andres Freund wrote:

On 2014-05-13 16:31:25 +0300, Heikki Linnakangas wrote:

Another thing I noticed is that when the output goes to a file, the file
isn't re-opened immediately on SIGHUP. Only after receiving some data from
the server. I believe that's also not intentional.


Hm. I can't really get excited about that one. Not doing that seems to
complicate matters unnecessarily. What's the problem here?


Not sure if it matters in any real-world scenario, but I found it pretty
surprising while playing with it. It should be trivial to fix; ISTM the
problem is that there is a continue in the loop when select() is
interrupted by signal, but the re-opening is done after the select() in the
loop. I think all you need to do is move the check for output_reopen to the
beginning of the loop.


Thanks for fixing and sorry for being slow :(.

Any reason you didn't also move the opening of the output file up? It
seems a bit odd to not have an output file existing every now and
then...


No particular reason. But that would actually be a great idea, because
currently you won't get any error if you give pg_recvlogical an invalid
path, until it receives the first piece of data from the server and
tries to write it to the file. I'll go and do that.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Andres Freund
Hi,

On 2014-05-15 20:07:23 +0300, Heikki Linnakangas wrote:
 On 05/15/2014 07:57 PM, Heikki Linnakangas wrote:
 Spotted while testing pg_recvlogical:
 
 1. Set up pg_recvlogical to receive:
 
 ./pg_recvlogical -S fooslot -d postgres --create
 ./pg_recvlogical -S fooslot -d postgres --start -f -
 
 2. In another terminal, with psql:
 
 create table foo (id int4);
 begin;
insert into foo values (4);
alter table foo alter column id type text;
 prepare transaction 'foo';
 commit prepared 'foo';
 insert into foo values (1);
 
 3.  With current HEAD, after commit
 bb38fb0d43c8d7ff54072bfd8bd63154e536b384, this produces an assertion
 failure:
 
 TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File:
 "reorderbuffer.c", Line: 508)
 
 I believe that's because we no longer assign another XID to the transaction that
 does the COMMIT PREPARED. Previously, an extra XID, in addition to the
 XID of the prepared transaction, was assigned for use in locking the
 global transaction entry in shared memory, but that's no longer required.
 
 However, even with that patch reverted, it doesn't work correctly:
 
 ERROR:  could not map filenode base/12142/16390 to relation OID
 LOG:  starting logical decoding for slot fooslot
 DETAIL:  streaming transactions committing after 0/16D1670, reading WAL
 from 0/16BC470
 
 Ok, so the immediate cause was quick to find: when decoding a
 commit-prepared WAL record, we have to use the XID from the record content
 (patch attached). The XID in the record header is the XID of the transaction
 doing the COMMIT PREPARED, which is always 0 after patch
 bb38fb0d43c8d7ff54072bfd8bd63154e536b384 which causes the assertion. But it
 was always wrong. After fixing, it no longer asserts or gives the above
 could not map filenode error.
 
 However, it still doesn't seem right. When I do the above as a regular
 transaction, ie:
 
 begin; insert into foo values (6);  alter table foo alter column id type
 text; commit;
 
 pg_recvlogical prints this:
 
 BEGIN 708
 table public.foo: INSERT: id[text]:'6'
 COMMIT 708
 
 But if I do it as a prepared transaction:
 
 begin; insert into foo values (7);  alter table foo alter column id type
 text; prepare transaction 'foo'; commit prepared 'foo';

Looking into it. I at some point dropped the prepared xact tests and
that was obviously a mistake. Will re-add them and fix.

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Andrew Dunstan


On 05/15/2014 12:43 PM, Tomas Vondra wrote:

Hi all,

today I got a few of errors like these (this one is from last week, though):

Status Line: 493 snapshot too old: Wed May  7 04:36:57 2014 GMT
Content:
snapshot to old: Wed May  7 04:36:57 2014 GMT

on the new buildfarm animals. I believe it was my mistake (incorrectly
configured local git mirror), but it got me thinking about how this will
behave with the animals running CLOBBER_CACHE_RECURSIVELY.

If I understand the Perl code correctly, it does this:

(1) update the repository
(2) run the tests
(3) check that the snapshot is not older than 24 hours (pgstatus.pl:188)
(4) fail if older

Now, imagine that the test runs for days/weeks. This pretty much means
it's wasted, because the results will be thrown away anyway, no?




The 24-hour window runs from the time of the latest commit on the branch in
question, not the current time, but basically yes.


We've never had machines with runs that long. The longest in recent 
times has been friarbird, which runs CLOBBER_CACHE_ALWAYS and takes 
around 4.5 hours. But we have had misconfigured machines reporting 
unbelievable snapshot times.  I'll take a look and see if we can tighten 
up the sanity check. It's worth noting that one thing friarbird does is 
skip the install-check stage - it's almost certainly not going to have 
terribly much interesting to tell us from that, given it has already run 
a plain make check.


How long does a CLOBBER_CACHE_RECURSIVELY run take? Days or weeks seems
kinda nuts.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Andres Freund
On 2014-05-15 20:07:23 +0300, Heikki Linnakangas wrote:
 Ok, so the immediate cause was quick to find: when decoding a
 commit-prepared WAL record, we have to use the XID from the record content
 (patch attached). The XID in the record header is the XID of the transaction
 doing the COMMIT PREPARED, which is always 0 after patch
 bb38fb0d43c8d7ff54072bfd8bd63154e536b384 which causes the assertion. But it
 was always wrong. After fixing, it no longer asserts or gives the above
 could not map filenode error.
 
 However, it still doesn't seem right. When I do the above as a regular
 transaction, ie:
 
 begin; insert into foo values (6);  alter table foo alter column id type
 text; commit;
 
 pg_recvlogical prints this:
 
 BEGIN 708
 table public.foo: INSERT: id[text]:'6'
 COMMIT 708
 
 But if I do it as a prepared transaction:
 
 begin; insert into foo values (7);  alter table foo alter column id type
 text; prepare transaction 'foo'; commit prepared 'foo';

How very weird. The reason for this is that
RecordTransactionCommitPrepared() forgets to fill a couple of fields in
xl_xact_commit. Any reason dbId/tsId aren't filled? They aren't strictly
needed because afaics they're only looked at for relcache invalidations
which prepared xacts don't support, but still?
That also explains why decoding for prepared xacts (besides the typo
you found) didn't work anymore - the filtering at commit was added
pretty late...

Attached patch fixes things, but I want to add some regression tests
before commit.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Andres Freund
On 2014-05-15 19:46:57 +0200, Andres Freund wrote:
 Attached patch fixes things, but I want to add some regression tests
 before commit.

And now actually attached. Will send a patch with regression tests later
tonight or tomorrow. Need to eat first...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 2cefa08..5d8e195 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2082,9 +2082,10 @@ RecordTransactionCommitPrepared(TransactionId xid,
 
 	/* Emit the XLOG commit record */
 	xlrec.xid = xid;
+	xlrec.crec.dbId = MyDatabaseId;
+	xlrec.crec.tsId = MyDatabaseTableSpace;
 	xlrec.crec.xact_time = GetCurrentTimestamp();
 	xlrec.crec.xinfo = initfileinval ? XACT_COMPLETION_UPDATE_RELCACHE_FILE : 0;
-	xlrec.crec.nmsgs = 0;
 	xlrec.crec.nrels = nrels;
 	xlrec.crec.nsubxacts = nchildren;
 	xlrec.crec.nmsgs = ninvalmsgs;
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 06b99e7..603d083 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -225,7 +225,7 @@ DecodeXactOp(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 subxacts = (TransactionId *) &(xlrec->xnodes[xlrec->nrels]);
 invals = (SharedInvalidationMessage *) &(subxacts[xlrec->nsubxacts]);
 
-DecodeCommit(ctx, buf, r->xl_xid, xlrec->dbId,
+DecodeCommit(ctx, buf, prec->xid, xlrec->dbId,
 			 xlrec->xact_time,
 			 xlrec->nsubxacts, subxacts,
 			 xlrec->nmsgs, invals);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

2014-05-15 Thread Pavel Stehule
Hello


2014-05-15 15:04 GMT+02:00 Sergey Muraviov sergey.k.murav...@gmail.com:

 Hi.
 Please review the new patch.


This version works perfectly

Regards

Pavel



 PS
 Issues which were described by Tom and Pavel were relevant to single-line
 headers.
 So I've added appropriate regression tests to the patch.

 I've also attached complex regression tests for unicode linestyle and
 multibyte symbols.


 2014-05-14 10:55 GMT+04:00 Pavel Stehule pavel.steh...@gmail.com:

 sorry

 there is still small issue

 I have a plpgsql function:

 CREATE OR REPLACE FUNCTION public.foo_update_trg()
  RETURNS trigger
  LANGUAGE plpgsql
 AS $function$
 DECLARE t text;
 BEGIN
   EXECUTE format('SELECT $1.%I', TG_ARGV[0]) INTO t USING old;
   RAISE NOTICE 'original value of % is %', TG_ARGV[0], t;
   RETURN NULL;
 END;
 $function$

 The default expanded view of select * from pg_proc where proname =
 'foo_update_trg'; is a little bit broken (screenshot 1).

 After wrap mode, it adds a useless new line into the source code (screenshot 2),

 but border 2 fixes it (screenshot 3).

 Regards

 Pavel



 2014-05-14 8:32 GMT+02:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

 With this patch it works perfect

 Thank you

 Regards

 Pavel


 2014-05-13 21:33 GMT+02:00 Sergey Muraviov sergey.k.murav...@gmail.com
 :

 Please check this patch.


 2014-05-12 22:56 GMT+04:00 Sergey Muraviov sergey.k.murav...@gmail.com
 :

 Hi.

 I'll try to fix it tomorrow.


 2014-05-12 18:42 GMT+04:00 Tom Lane t...@sss.pgh.pa.us:

 Greg Stark st...@mit.edu writes:
  On Mon, May 12, 2014 at 2:12 PM, Greg Stark st...@mit.edu wrote:
  Hm, there was an off by one error earlier in some cases, maybe we
  fixed it by breaking other case. Will investigate.

  Those spaces are coming from the ascii wrapping indicators. i.e.
 the periods in:

 Ah.  I wonder whether anyone will complain that the format changed?

  Apparently we used to print those with border=1 in normal mode but
 in
  expanded mode we left out the space for those on the outermost edges
  since there was no need for them. If we put them in for wrapped mode
  then we'll be inconsistent if we don't for nonwrapped mode though.
 And
  if we don't put them in for wrapped mode then there's no way to
  indicate wrapping versus newlines.

 Barring anyone complaining that the format changed, I'd say the issue
 is not that you added them but that the accounting for line length
 fails to include them.

 regards, tom lane




 --
 Best regards,
 Sergey Muraviov




 --
 Best regards,
 Sergey Muraviov






 --
 Best regards,
 Sergey Muraviov



[HACKERS] pg_dump warnings in MinGW build

2014-05-15 Thread Jeff Janes
Now that popen and pclose don't throw thousands of warnings when compiling
mingw builds, some other warnings stand out.


parallel.c: In function 'pgpipe':
parallel.c:1332:2: warning: overflow in implicit constant conversion
[-Woverflow]
parallel.c:1386:3: warning: overflow in implicit constant conversion
[-Woverflow]

I think the solution is to use the pgsocket typedef from
src/include/port.h, rather than int.  Like attached.

But I'm far from being a typedef lawyer, so maybe I am all wet.
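
A sketch of the sort of change meant (the actual attached patch may differ;
it assumes port.h, which defines pgsocket and PGINVALID_SOCKET, is already
in scope):

/* Before: plain ints, so assigning Windows' 64-bit SOCKET values such as
 * PGINVALID_SOCKET presumably triggers the -Woverflow warnings. */
int      handles_before[2];

/* After: use the portability typedef and its matching invalid value. */
pgsocket handles_after[2] = {PGINVALID_SOCKET, PGINVALID_SOCKET};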

Cheers,

Jeff


pgpipe.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Bruce Momjian
On Mon, May 12, 2014 at 06:01:59PM +0300, Heikki Linnakangas wrote:
 Some of the stuff in here will influence whether your freezing
 replacement patch gets in. Do you plan to further pursue that one?
 
 Not sure. I got to the point where it seemed to work, but I got a
 bit of cold feet proceeding with it. I used the page header's LSN
 field to define the epoch of the page, but I started to feel
 uneasy about it. I would be much more comfortable with an extra
 field in the page header, even though that uses more disk space. And
 requires dealing with pg_upgrade.

FYI, pg_upgrade copies pg_clog from the old cluster, so there will be a
pg_upgrade issue anyway.

I am not excited about a 32x increase in clog size, especially since we
already do freezing at 200M transactions to allow for more aggressive
clog trimming.  Extrapolating that out, it means we would freeze every
6.25M transactions.
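
(For the arithmetic behind that extrapolation: keeping roughly the same
on-disk budget that currently triggers freezing at 200M transactions, but
with entries 32 times as large, means freezing about every
200M / 32 = 6.25M transactions.)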

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New timezones used in regression tests

2014-05-15 Thread Bruce Momjian
On Mon, May 12, 2014 at 07:16:48PM -0400, Tom Lane wrote:
 Christoph Berg c...@df7cb.de writes:
  84df54b22e8035addc7108abd9ff6995e8c49264 introduced timestamp
  constructors. In the regression tests, various time zones are tested,
  including America/Metlakatla. Now, if you configure using
  --with-system-tzdata, you'll get an error if that zone isn't there.
  Unfortunately, this is what I'm getting now when trying to build beta1
  on Ubuntu 10.04 (lucid) with tzdata 2010i-1:
 
 I agree, that seems an entirely gratuitous choice of zone.  It does
 seem like a good idea to test a zone that has a nonintegral offset
 from GMT, but we can get that from almost anywhere as long as we're
 testing a pre-1900 date.  There's no need to use any zones that aren't
 long-established and unlikely to change.

If we want a nonintegral offset, why are we not using 'Asia/Calcutta',
which is +5:30 from UTC?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freezing without write I/O

2014-05-15 Thread Robert Haas
On Wed, May 14, 2014 at 8:46 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 +1.  I can't think of many things we might do that would be more
 important.

 Can anyone guess how likely this approach is to make it into 9.5?  I've been
 pondering some incremental improvements over what we have now, but if this
 revolutionary approach has a high chance of landing then any work on
 incremental improvements would be pointless.

Well, Heikki was saying on another thread that he had kind of gotten
cold feet about this, so I gather he's not planning to pursue it.  Not
sure if I understood that correctly.  If so, I guess it depends on
whether someone else can pick it up, but we might first want to
establish why he got cold feet and how worrying those problems seem to
other people.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New timezones used in regression tests

2014-05-15 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Mon, May 12, 2014 at 07:16:48PM -0400, Tom Lane wrote:
 I agree, that seems an entirely gratuitous choice of zone.  It does
 seem like a good idea to test a zone that has a nonintegral offset
 from GMT, but we can get that from almost anywhere as long as we're
 testing a pre-1900 date.  There's no need to use any zones that aren't
 long-established and unlikely to change.

 If we want a nonintegral offset, why are we not using 'Asia/Calcutta',
 which is +5:30 from UTC?

I believe there's already one of those tests that considers a zone like
that.  No, I meant a really odd offset, like Paris' +0:09:21 before they
adopted standardized time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New timezones used in regression tests

2014-05-15 Thread David Fetter
On Tue, May 13, 2014 at 09:55:26AM -0400, Alvaro Herrera wrote:
 Christoph Berg wrote:
 
  Of course, Wikipedia has something to say about this:
  http://en.wikipedia.org/wiki/Timekeeping_on_Mars
 
 Nice.
 
  I especially like MTC, Mars Time Coordinated. But whatever scheme gets
  chosen, it won't be a standard 24h day, so PostgreSQL has a whole lot
  of different problems to solve than to fix that little
  Mars/Mons_Olympus gem now... :)
 
 Maybe a new type, mars_timestamptz()?  Or perhaps the celestial body
 name should be part of the typmod for standard timestamptz ...?

The latter seems a good bit more extensible.

Conversions among the different timestamptzs might be problematic, as
we are currently assuming certain approximations about spacetime that
don't actually hold when we have time zones in significantly different
reference frames.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gettimeofday is at the end of its usefulness?

2014-05-15 Thread Bruce Momjian
On Tue, May 13, 2014 at 06:58:11PM -0400, Tom Lane wrote:
 A recent question from Tim Kane prompted me to measure the overhead
 costs of EXPLAIN ANALYZE, which I'd not checked in awhile.  Things
 are far worse than I thought.  On my current server (by no means
 lavish hardware: Xeon E5-2609 @2.40GHz) a simple seqscan can run
 at something like 110 nsec per row:

I assume you ran pg_test_timing too:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 41.70 nsec
Histogram of timing durations:
 usec   % of total  count
     1     95.83035   68935459
     2      4.16923    2999133
     4      0.00037        268
     8      0.00004         31
    16      0.00000          1
    32      0.00000          1

My overhead of 41.70 nsec matches yours.
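
For anyone who wants to reproduce that number outside the server, a rough stand-alone loop (plain POSIX clock_gettime(), not the pg_test_timing source) gives the same kind of figure:

#include <stdio.h>
#include <time.h>

int
main(void)
{
    struct timespec start,
                now;
    long long   loops = 0;
    double      elapsed_ns;

    clock_gettime(CLOCK_MONOTONIC, &start);
    do
    {
        clock_gettime(CLOCK_MONOTONIC, &now);
        loops++;
    } while (now.tv_sec - start.tv_sec < 3);    /* roughly 3 seconds */

    elapsed_ns = (now.tv_sec - start.tv_sec) * 1e9 +
        (now.tv_nsec - start.tv_nsec);
    printf("per-loop time including overhead: %.2f nsec\n",
           elapsed_ns / loops);
    return 0;
}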

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Robert Haas
On Thu, May 15, 2014 at 8:06 AM, Bruce Momjian br...@momjian.us wrote:
 On Tue, May  6, 2014 at 11:15:17PM +0100, Simon Riggs wrote:
  Well, for what it's worth, I've encountered systems where setting
  effective_cache_size too low resulted in bad query plans, but I've
  never encountered the reverse situation.

 I agree with that.

 Though that misses my point, which is that you can't know that all of
 that memory is truly available on a server with many concurrent users.
 Choosing settings that undercost memory intensive plans is not the
 best choice for a default strategy in a mixed workload when cache may
 be better used elsewhere, even if such settings make sense for some
 individual users.

 This is the same problem we had with auto-tuning work_mem, in that we
 didn't know what other concurrent activity was happening.  Seems we need
 concurrent activity detection before auto-tuning work_mem and
 effective_cache_size.

I think it's worse than that: we don't even know what else is
happening *in the same query*.  For example, look at this:

http://www.postgresql.org/message-id/16161.1324414...@sss.pgh.pa.us

That's pretty awful, and it's just one example of a broader class of
problems that we haven't even tried to solve.  We really need a way to
limit memory usage on a per-query basis rather than a per-node basis.
For example, consider a query plan that needs to do four sorts.  If
work_mem = 64MB, we'll happily use 256MB total, 64MB for each sort.
Now, that might cause the system to swap: since there are four sorts,
maybe we ought to have used only 16MB per sort, and switched to a heap
sort if that wasn't enough.  But it's even subtler than that: if we
had known when building the query plan that we only had 16MB per sort
rather than 64MB per sort, we would potentially have estimated higher
costs for those sorts in the first place, which might have led to a
different plan that needed fewer sorts to begin with.
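
To make the four-sorts example concrete, here is a toy sketch of per-query budgeting (the function name and policy are invented for illustration; it is not a proposal for how the planner should actually divide a budget, and it ignores the planning circularity just described):

#include <stdio.h>

typedef long long MemSize;      /* stand-in for Size */

/* naive policy: split one per-query budget evenly across memory-hungry nodes */
static MemSize
per_node_budget(MemSize query_work_mem, int n_memory_nodes)
{
    if (n_memory_nodes <= 0)
        return query_work_mem;
    return query_work_mem / n_memory_nodes;
}

int
main(void)
{
    MemSize     work_mem = 64LL * 1024 * 1024;  /* 64MB, as in the example */

    /* today: each of the four sorts gets a full work_mem */
    printf("per-node limit: %lldMB total across 4 sorts\n",
           4 * work_mem / (1024 * 1024));
    /* per-query budget: the same 64MB shared, 16MB per sort */
    printf("per-query budget: %lldMB per sort\n",
           per_node_budget(work_mem, 4) / (1024 * 1024));
    return 0;
}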

When you start to try to balance memory usage across multiple
backends, things get even more complicated.  If the first query that
starts up is allowed to use all the available memory, and we respond
to that by lowering the effective value of work_mem to something very
small, a second query that shows up a bit later might choose a very
inefficient plan as a result.  That in turn might cause heavy I/O load
on the system for a long time, making the first query run very slowly.
 We might have been better off just letting the first query finish,
and then running the second one (with a much better plan) after it was
done.  Or, maybe we should have only let the first query take a
certain fraction (half? 10%?) of the available memory, so that there
was more left for the second guy.  But that could be wrong too - it
might cause the first plan to be unnecessarily inefficient when nobody
was planning to run any other queries anyway.  Plus, DBAs hate it when
plans change on them unexpectedly, so anything that involves a
feedback loop between current utilization and query plans will be
unpopular with some people for that reason.

These are hard problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Race condition between PREPARE TRANSACTION and COMMIT PREPARED (was Re: [HACKERS] Problem with txid_snapshot_in/out() functionality)

2014-05-15 Thread Robert Haas
On Thu, May 15, 2014 at 10:38 AM, Andres Freund and...@2ndquadrant.com wrote:
 shrug. async.c and namespace.c does the same, and it hasn't been a
 problem.

 Well, it doesn't seem unreasonable to have C code using
 PG_ENSURE_ERROR_CLEANUP/PG_END_ENSURE_ERROR_CLEANUP around a 2pc commit
 to me. That'll break with this.
 Perhaps we should just finally make cancel_before_shmem_exit search the
 stack of callbacks.

Yes, please.  And while we're at it, perhaps we should make it Trap()
or fail an Assert() if it doesn't find the callback it was told to
remove.
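
As a self-contained model of that suggestion (toy types only, not the real ipc.c structures): cancelling scans the whole stack instead of only the topmost entry, and an Assert-style check complains if the callback was never registered.

#include <assert.h>
#include <stdio.h>

typedef void (*exit_cb) (int code, void *arg);

static struct { exit_cb func; void *arg; } before_exit[32];
static int  n_before_exit = 0;

static void
register_before_exit(exit_cb func, void *arg)
{
    before_exit[n_before_exit].func = func;
    before_exit[n_before_exit].arg = arg;
    n_before_exit++;
}

/*
 * Instead of only popping the topmost entry, search down the stack for the
 * matching callback and close the gap; complain loudly if it isn't there.
 */
static void
cancel_before_exit(exit_cb func, void *arg)
{
    for (int i = n_before_exit - 1; i >= 0; i--)
    {
        if (before_exit[i].func == func && before_exit[i].arg == arg)
        {
            for (int j = i; j < n_before_exit - 1; j++)
                before_exit[j] = before_exit[j + 1];
            n_before_exit--;
            return;
        }
    }
    assert(!"callback to cancel was not registered");
}

static void cb_a(int code, void *arg) { (void) code; (void) arg; }
static void cb_b(int code, void *arg) { (void) code; (void) arg; }

int
main(void)
{
    register_before_exit(cb_a, NULL);
    register_before_exit(cb_b, NULL);   /* e.g. from an error-cleanup block */
    cancel_before_exit(cb_a, NULL);     /* works although cb_a isn't topmost */
    printf("remaining callbacks: %d\n", n_before_exit);
    return 0;
}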

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] autovacuum scheduling starvation and frenzy

2014-05-15 Thread Jeff Janes
In testing 9.4 with some long running tests, I noticed that autovacuum
launcher/worker sometimes goes a bit nuts.  It vacuums the same database
repeatedly without respect to the nap time.

As far as I can tell, the behavior is the same in older versions, but I
haven't tested that.

This is my understanding of what is happening:

If you have a database with a large table in it that has just passed
autovacuum_freeze_max_age, all future workers will be funnelled into that
database until the wrap-around completes.  But only one of those workers
can actually vacuum the one table which is holding back the frozenxid.
Maybe the 2nd worker to come along will find other useful work to do, but
eventually all the vacuuming that needs doing is already in progress, and
so each worker starts up, gets directed to this database, finds it can't
help, and exits.  So all other databases are entirely starved of
autovacuuming for the entire duration of the wrap-around vacuuming of this
one large table.

Also, the launcher decides when to launch the next worker by looking at the
scheduled time of the least-recently-vacuumed database (with the implicit
intention that that is the one that will get chosen to vacuum next).  But
since the worker gets redirected to the wrap-around database instead of the
least-recently-vacuumed database, the least-recently-vacuumed database
never gets its schedule updated and always looks like it is chronologically
overdue.  That means the launcher keeps launching new workers as fast as
the previous ones exit, ignoring the nap time. So there is one long running
worker actually making progress, plus a frenzy of workers all attacking the
same database, finding that there is nothing they can do.

I think that a database whose age exceeds autovacuum_freeze_max_age should get
first priority, but only if its next scheduled vacuum time is in the past.
 If it can beneficially use more than one vacuum worker, they would usually
accumulate there naturally within a few naptime iterations[1].  And if it
can't usefully use more than one worker, don't prevent other databases from
using them.

[1] you could argue that all other max_workers processes could become
pinned down in long running vacuums of other nonrisk databases between the
time that the database crosses autovacuum_freeze_max_age (and has its first
worker started), and the time its nap time expires and so it becomes
eligible for a second one.  But that seems like a weak argument, as it
could just as easily have happened that all of them got pinned down in nonrisk
databases a few transactions *before* the database crosses
autovacuum_freeze_max_age in the first place.

Does this analysis and proposal seem sound?

Cheers,

Jeff


Re: [HACKERS] New timezones used in regression tests

2014-05-15 Thread Bruce Momjian
On Thu, May 15, 2014 at 02:47:21PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Mon, May 12, 2014 at 07:16:48PM -0400, Tom Lane wrote:
  I agree, that seems an entirely gratuitous choice of zone.  It does
  seem like a good idea to test a zone that has a nonintegral offset
  from GMT, but we can get that from almost anywhere as long as we're
  testing a pre-1900 date.  There's no need to use any zones that aren't
  long-established and unlikely to change.
 
  If we want a nonintegral offset, why are we not using 'Asia/Calcutta',
  which is +5:30 from UTC?
 
 I believe there's already one of those tests that considers a zone like
 that.  No, I meant a really odd offset, like Paris' +0:09:21 before they
 adopted standardized time.

Wow, OK, got it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Heikki Linnakangas

On 05/15/2014 08:46 PM, Andres Freund wrote:

On 2014-05-15 20:07:23 +0300, Heikki Linnakangas wrote:

Ok, so the immediate cause was quick to find: when decoding a
commit-prepared WAL record, we have to use the XID from the record content
(patch attached). The XID in the record header is the XID of the transaction
doing the COMMIT PREPARED, which is always 0 after patch
bb38fb0d43c8d7ff54072bfd8bd63154e536b384 which causes the assertion. But it
was always wrong. After fixing, it no longer asserts or gives the above
could not map filenode error.

However, it still doesn't seem right. When I do the above as a regular
transaction, ie:

begin; insert into foo values (6);  alter table foo alter column id type
text; commit;

pg_recvlogical prints this:

BEGIN 708
table public.foo: INSERT: id[text]:'6'
COMMIT 708

But if I do it as a prepared transaction:

begin; insert into foo values (7);  alter table foo alter column id type
text; prepare transaction 'foo'; commit prepared 'foo';


How very weird. The reason for this is that
RecordTransactionCommitPrepared() forgets to fill a couple of fields in
xl_xact_commit. Any reason dbId/tsId aren't filled? They aren't strictly
needed because afaics they're only looked at for relcache invalidations
which prepared xacts don't support, but still?


Seems like an oversight in commit dd428c79, which added the fields to 
xl_xact_commit. They are needed. A prepared xact can indeed cause 
relcache invalidations, and removal of the init file. For example:


 begin; cluster pg_opclass using pg_opclass_oid_index ; prepare 
transaction 'foo'; commit prepared 'foo';



Attached patch fixes things, but I want to add some regression tests
before commit.


Looks good to me.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Robert Haas
On Thu, May 15, 2014 at 2:34 PM, Bruce Momjian br...@momjian.us wrote:
 On Mon, May 12, 2014 at 06:01:59PM +0300, Heikki Linnakangas wrote:
 Some of the stuff in here will influence whether your freezing
 replacement patch gets in. Do you plan to further pursue that one?

 Not sure. I got to the point where it seemed to work, but I got a
 bit of cold feet proceeding with it. I used the page header's LSN
 field to define the epoch of the page, but I started to feel
 uneasy about it. I would be much more comfortable with an extra
 field in the page header, even though that uses more disk space. And
 requires dealing with pg_upgrade.

 FYI, pg_upgrade copies pg_clog from the old cluster, so there will be a
 pg_upgrade issue anyway.

 I am not excited about a 32x increase in clog size, especially since we
 already do freezing at 200M transactions to allow for more aggressive
 clog trimming.  Extrapolating that out, it means we would freeze every
 6.25M transactions.

It seems better to allow clog to grow larger than to force
more-frequent freezing.

If the larger clog size is a show-stopper (and I'm not sure I have an
intelligent opinion on that just yet), one way to get around the
problem would be to summarize CLOG entries after-the-fact.  Once an
XID precedes the xmin of every snapshot, we don't need to know the
commit LSN any more.  So we could read the old pg_clog files and write
new summary files.  Since we don't need to care about subcommitted
transactions either, we could get by with just 1 bit per transaction,
1 = committed, 0 = aborted.  Once we've written and fsync'd the
summary files, we could throw away the original files.  That might
leave us with a smaller pg_clog than what we have today.
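
A self-contained sketch of that summarization pass (toy buffers only; the real clog page layout, subcommit handling and fsync'ing are all elided): once every xact in a range is older than the oldest xmin, its 2-bit status can collapse to a single committed/aborted bit.

#include <stdint.h>
#include <stdio.h>

/* 2-bit statuses, as in clog today */
#define XACT_IN_PROGRESS    0x00
#define XACT_COMMITTED      0x01
#define XACT_ABORTED        0x02
#define XACT_SUBCOMMITTED   0x03

/*
 * Collapse 4 xacts/byte (2 bits each) into 8 xacts/byte (1 bit each):
 * bit set = committed, bit clear = aborted.  Only valid for xacts already
 * older than every snapshot's xmin, so nothing is in-progress/subcommitted.
 */
static void
summarize_clog(const uint8_t *clog, uint8_t *summary, int nxacts)
{
    for (int xid = 0; xid < nxacts; xid++)
    {
        int     status = (clog[xid / 4] >> ((xid % 4) * 2)) & 0x03;

        if (status == XACT_COMMITTED)
            summary[xid / 8] |= 1 << (xid % 8);
    }
}

int
main(void)
{
    /* 8 xacts: committed, aborted, committed, committed, then 4 aborted */
    uint8_t     clog[2] = {
        XACT_COMMITTED | (XACT_ABORTED << 2) |
        (XACT_COMMITTED << 4) | (XACT_COMMITTED << 6),
        XACT_ABORTED | (XACT_ABORTED << 2) |
        (XACT_ABORTED << 4) | (XACT_ABORTED << 6)
    };
    uint8_t     summary[1] = {0};

    summarize_clog(clog, summary, 8);
    printf("summary byte: 0x%02x\n", summary[0]);   /* 0x0d: bits 0, 2, 3 */
    return 0;
}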

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum scheduling starvation and frenzy

2014-05-15 Thread Alvaro Herrera
Jeff Janes wrote:

 If you have a database with a large table in it that has just passed
 autovacuum_freeze_max_age, all future workers will be funnelled into that
 database until the wrap-around completes.  But only one of those workers
 can actually vacuum the one table which is holding back the frozenxid.
 Maybe the 2nd worker to come along will find other useful work to do, but
 eventually all the vacuuming that needs doing is already in progress, and
 so each worker starts up, gets directed to this database, finds it can't
 help, and exits.  So all other databases are entirely starved of
 autovacuuming for the entire duration of the wrap-around vacuuming of this
 one large table.

Bah.  Of course :-(

Note that if you have two databases in danger of wraparound, the oldest
will always be chosen until it's no longer in danger.  Ignoring the
second one past freeze_max_age seems bad also.

This code is in autovacuum.c, do_start_worker().  Not sure what does
your proposal look like in terms of code.  I think that instead of
trying to get a single target database in that foreach loop, we could
try to build a prioritized list (in-wraparound-danger first, then
in-multixid-wraparound danger, then the one with the oldest autovac time
of all the ones that remain); then recheck the wrap-around condition by
seeing whether there are other workers in that database that started
after the wraparound condition appeared.  If there are, move down the
list.  The first in the list not skipped is chosen for vacuuming.
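
Roughly what that prioritized selection could look like, as a toy model (the struct and field names here are invented, not autovacuum.c's; the real code would also have to recheck its decision under lock):

#include <stdbool.h>
#include <stdio.h>

typedef struct avdb
{
    const char *name;
    bool        xid_wraparound;     /* past autovacuum_freeze_max_age */
    bool        mxid_wraparound;    /* past the multixact threshold */
    double      next_sched;         /* scheduled time, seconds relative to now */
    int         workers_since_danger;   /* workers started after danger arose */
} avdb;

/* 0 = xid wraparound, 1 = multixact wraparound, 2 = everything else */
static int
priority(const avdb *db)
{
    if (db->xid_wraparound)
        return 0;
    if (db->mxid_wraparound)
        return 1;
    return 2;
}

static const avdb *
choose_db(const avdb *dbs, int n)
{
    for (int pass = 0; pass <= 2; pass++)
    {
        const avdb *best = NULL;

        for (int i = 0; i < n; i++)
        {
            if (priority(&dbs[i]) != pass)
                continue;
            /* skip a danger db that already got a worker after the danger arose */
            if (pass < 2 && dbs[i].workers_since_danger > 0)
                continue;
            if (best == NULL || dbs[i].next_sched < best->next_sched)
                best = &dbs[i];
        }
        if (best != NULL)
            return best;
    }
    return NULL;    /* everything skipped; perhaps fall back to the list head */
}

int
main(void)
{
    avdb    dbs[] = {
        {"wraparound_db", true, false, -600, 1},    /* already being vacuumed */
        {"starved_db", false, false, -30, 0},       /* merely overdue */
    };

    printf("next worker goes to: %s\n", choose_db(dbs, 2)->name);
    return 0;
}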

(Do we need to consider the situation that all databases were skipped by
the above logic, and if so then perhaps pick up the first DB in the
list?)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSOC13 proposal - extend RETURNING syntax

2014-05-15 Thread Arthur Axel 'fREW' Schmidt
Andres Freund andres at anarazel.de writes:

 
 Hi,
 
 Some comments about the patch:
 * Coding Style:
   * multiline comments have both /* and */ on their own lines.
   * I think several places indent by two tabs.
   * Spaces around operators
   * ...
 * Many of the new comments would enjoy a bit TLC by a native speaker.
 
 * The way RTE_ALIAS creeps in many place where it doesn't seem to belong
   seems to indicate that the design isn't yet ready. I share Robert's
   suspicion that this would be better solved by referring to a special
   range table entry.
 
 Based on the lack of activity around this and the fact that this needs a
 *significant* chunk of work before being committable, I am going to mark
 this as returned with feedback.

I'm actively working towards converting our software at work to use Pg
instead of SQL Server and before we switch we'll need this feature to be
merged.  I'll do what I can to get the verbiage and style whipped into shape,
though I doubt I can do much with the actual code.  Hopefully I can get
something in soon.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Tomas Vondra
On 15 Květen 2014, 19:46, Andrew Dunstan wrote:

 On 05/15/2014 12:43 PM, Tomas Vondra wrote:
 Hi all,

 today I got a few of errors like these (this one is from last week,
 though):

 Status Line: 493 snapshot too old: Wed May  7 04:36:57 2014 GMT
 Content:
 snapshot to old: Wed May  7 04:36:57 2014 GMT

 on the new buildfarm animals. I believe it was my mistake (incorrectly
 configured local git mirror), but it got me thinking about how this will
 behave with the animals running CLOBBER_CACHE_RECURSIVELY.

 If I understand the Perl code correctly, it does this:

 (1) update the repository
 (2) run the tests
 (3) check that the snapshot is not older than 24 hours (pgstatus.pl:188)
 (4) fail if older

 Now, imagine that the test runs for days/weeks. This pretty much means
 it's wasted, because the results will be thrown away anyway, no?



 The 24 hours runs from the time of the latest commit on the branch in
 question, not the current time, but basically yes.

 We've never had machines with runs that long. The longest in recent
 times has been friarbird, which runs CLOBBER_CACHE_ALWAYS and takes
 around 4.5 hours. But we have had misconfigured machines reporting
 unbelievable snapshot times.  I'll take a look and see if we can tighten
 up the sanity check. It's worth noting that one thing friarbird does is
 skip the install-check stage - it's almost certainly not going to have
 terribly much interesting to tell us from that, given it has already run
 a plain make check.

 How long does a CLOBBER_CACHE_RECURSIVELY run take? days or weeks seems
 kinda nuts.

I don't know. According to this comment from cache/inval.c, it's expected
to be way slower (~100x) compared to CLOBBER_CACHE_ALWAYS.

/*
 * Test code to force cache flushes anytime a flush could happen.
 *
 * If used with CLOBBER_FREED_MEMORY, CLOBBER_CACHE_ALWAYS provides a
 * fairly thorough test that the system contains no cache-flush hazards.
 * However, it also makes the system unbelievably slow --- the regression
 * tests take about 100 times longer than normal.
 *
 * If you're a glutton for punishment, try CLOBBER_CACHE_RECURSIVELY. This
 * slows things by at least a factor of 10000, so I wouldn't suggest
 * trying to run the entire regression tests that way.  It's useful to try
 * a few simple tests, to make sure that cache reload isn't subject to
 * internal cache-flush hazards, but after you've done a few thousand
 * recursive reloads it's unlikely you'll learn more.
 */

regards
Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Andres Freund
On 2014-05-15 22:30:53 +0300, Heikki Linnakangas wrote:
 On 05/15/2014 08:46 PM, Andres Freund wrote:
 On 2014-05-15 20:07:23 +0300, Heikki Linnakangas wrote:
 How very weird. The reason for this is that
 RecordTransactionCommitPrepared() forgets to fill a couple of fields in
 xl_xact_commit. Any reason dbId/tsId aren't filled? They aren't strictly
 needed because afaics they're only looked at for relcache invalidations
 which prepared xacts don't support, but still?
 
 Seems like an oversight in commit dd428c79, which added the fields to
 xl_xact_commit. They are needed. A prepared xact can indeed cause relcache
 invalidations, and removal of the init file. For example:

Hm, so that part has to be backpatched to 9.0. Ick, I wonder if that's
been hit in production. Seems like it could cause pretty random looking
errors. It's easy enough to cause errors like:
ERROR:  could not open file base/12753/12613: No such file or directory
I guess not many people will do relevant stuff in prepared xacts tho.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Andres Freund
On 2014-05-15 15:40:06 -0400, Robert Haas wrote:
 On Thu, May 15, 2014 at 2:34 PM, Bruce Momjian br...@momjian.us wrote:
  On Mon, May 12, 2014 at 06:01:59PM +0300, Heikki Linnakangas wrote:
  Some of the stuff in here will influence whether your freezing
  replacement patch gets in. Do you plan to further pursue that one?
 
  Not sure. I got to the point where it seemed to work, but I got a
  bit of cold feet proceeding with it. I used the page header's LSN
  field to define the epoch of the page, but I started to feel
  uneasy about it. I would be much more comfortable with an extra
  field in the page header, even though that uses more disk space. And
  requires dealing with pg_upgrade.
 
  FYI, pg_upgrade copies pg_clog from the old cluster, so there will be a
  pg_upgrade issue anyway.
 
  I am not excited about a 32x increase in clog size, especially since we
  already do freezing at 200M transactions to allow for more aggressive
  clog trimming.  Extrapolating that out, it means we would freeze every
  6.25M transactions.

The default setting imo is far too low for a database of any relevant
activity. If I had the stomach for the fight around it I'd suggest
increasing it significantly by default. People with small databases
won't be hurt significantly because they simply don't have that many
transactions and autovacuum will get around to cleanup long before
normally.

 It seems better to allow clog to grow larger than to force
 more-frequent freezing.

Yes.

 If the larger clog size is a show-stopper (and I'm not sure I have an
 intelligent opinion on that just yet), one way to get around the
 problem would be to summarize CLOG entries after-the-fact.  Once an
 XID precedes the xmin of every snapshot, we don't need to know the
 commit LSN any more.  So we could read the old pg_clog files and write
 new summary files.  Since we don't need to care about subcommitted
 transactions either, we could get by with just 1 bit per transaction,
 1 = committed, 0 = aborted.  Once we've written and fsync'd the
 summary files, we could throw away the original files.  That might
 leave us with a smaller pg_clog than what we have today.

I think the easiest way for now would be to have pg_clog with the same
format as today and a rangewise much smaller pg_csn storing the lsns
that are needed. That'll leave us with pg_upgrade'ability without
needing to rewrite pg_clog during the upgrade.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Andrew Dunstan


On 05/15/2014 03:57 PM, Tomas Vondra wrote:

How long does a CLOBBER_CACHE_RECURSIVELY run take? days or weeks seems
kinda nuts.

I don't know. According to this comment from cache/inval.c, it's expected
to be way slower (~100x) compared to CLOBBER_CACHE_ALWAYS.

/*
  * Test code to force cache flushes anytime a flush could happen.
  *
  * If used with CLOBBER_FREED_MEMORY, CLOBBER_CACHE_ALWAYS provides a
  * fairly thorough test that the system contains no cache-flush hazards.
  * However, it also makes the system unbelievably slow --- the regression
  * tests take about 100 times longer than normal.
  *
  * If you're a glutton for punishment, try CLOBBER_CACHE_RECURSIVELY. This
  * slows things by at least a factor of 10000, so I wouldn't suggest
  * trying to run the entire regression tests that way.  It's useful to try
  * a few simple tests, to make sure that cache reload isn't subject to
  * internal cache-flush hazards, but after you've done a few thousand
  * recursive reloads it's unlikely you'll learn more.
  */



Yes, I've seen that. Frankly, a test that takes something like 500 hours 
is a bit crazy.


If we really want to run this in the buildfarm we should probably try to 
create a massively cut down test schedule for use in this case.


Incidentally, should the CLOBBER_CACHE_ALWAYS machines also be defining 
CLOBBER_FREED_MEMORY?


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Bruce Momjian
On Thu, May 15, 2014 at 10:06:32PM +0200, Andres Freund wrote:
  If the larger clog size is a show-stopper (and I'm not sure I have an
  intelligent opinion on that just yet), one way to get around the
  problem would be to summarize CLOG entries after-the-fact.  Once an
  XID precedes the xmin of every snapshot, we don't need to know the
  commit LSN any more.  So we could read the old pg_clog files and write
  new summary files.  Since we don't need to care about subcommitted
  transactions either, we could get by with just 1 bit per transaction,
  1 = committed, 0 = aborted.  Once we've written and fsync'd the
  summary files, we could throw away the original files.  That might
  leave us with a smaller pg_clog than what we have today.
 
 I think the easiest way for now would be to have pg_clog with the same
 format as today and a rangewise much smaller pg_csn storing the lsns
 that are needed. That'll leave us with pg_upgrade'ability without
 needing to rewrite pg_clog during the upgrade.

Yes, I like the idea of storing the CSN separately.  One reason the
2-bit clog is so good is that we know we have atomic 1-byte writes on
all platforms.  Can we assume atomic 64-bit writes?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Incidentally, should the CLOBBER_CACHE_ALWAYS machines also be defining 
 CLOBBER_FREED_MEMORY?

The former does need the latter or it's not very thorough.  However,
CLOBBER_FREED_MEMORY is defined automatically by --enable-cassert,
so you shouldn't need to use a -D switch for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Andres Freund
On 2014-05-15 16:13:49 -0400, Bruce Momjian wrote:
 On Thu, May 15, 2014 at 10:06:32PM +0200, Andres Freund wrote:
   If the larger clog size is a show-stopper (and I'm not sure I have an
   intelligent opinion on that just yet), one way to get around the
   problem would be to summarize CLOG entries after-the-fact.  Once an
   XID precedes the xmin of every snapshot, we don't need to know the
   commit LSN any more.  So we could read the old pg_clog files and write
   new summary files.  Since we don't need to care about subcommitted
   transactions either, we could get by with just 1 bit per transaction,
   1 = committed, 0 = aborted.  Once we've written and fsync'd the
   summary files, we could throw away the original files.  That might
   leave us with a smaller pg_clog than what we have today.
  
  I think the easiest way for now would be to have pg_clog with the same
  format as today and a rangewise much smaller pg_csn storing the lsns
  that are needed. That'll leave us with pg_upgrade'ability without
  needing to rewrite pg_clog during the upgrade.
 
 Yes, I like the idea of storing the CSN separately.  One reason the
 2-bit clog is so good is that we know we have atomic 1-byte writes on
 all platforms.

I don't think we rely on that anywhere. And in fact we don't have the
ability to do so for arbitrary bytes - lots of platforms can do that
only on specifically aligned bytes.

We rely on being able to atomically (as in either before/after no torn
value) write/read TransactionIds, but that's it I think?

  Can we assume atomic 64-bit writes?

Not on 32bit platforms.
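
Independent of PostgreSQL's own code, a quick C11 check shows what can and cannot be assumed about plain 64-bit loads/stores; on many 32-bit targets atomic_is_lock_free() reports false, which is the point above:

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
    _Atomic uint64_t lsn = 0;

    printf("64-bit atomics lock-free here? %s\n",
           atomic_is_lock_free(&lsn) ? "yes" : "no");

    atomic_store(&lsn, UINT64_C(0x0123456789abcdef));
    printf("read back: 0x%016llx\n",
           (unsigned long long) atomic_load(&lsn));
    return 0;
}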

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Stefan Kaltenbrunner
On 05/15/2014 07:46 PM, Andrew Dunstan wrote:
 
 On 05/15/2014 12:43 PM, Tomas Vondra wrote:
 Hi all,

 today I got a few of errors like these (this one is from last week,
 though):

 Status Line: 493 snapshot too old: Wed May  7 04:36:57 2014 GMT
 Content:
 snapshot to old: Wed May  7 04:36:57 2014 GMT

 on the new buildfarm animals. I believe it was my mistake (incorrectly
 configured local git mirror), but it got me thinking about how this will
 behave with the animals running CLOBBER_CACHE_RECURSIVELY.

 If I understand the Perl code correctly, it does this:

 (1) update the repository
 (2) run the tests
 (3) check that the snapshot is not older than 24 hours (pgstatus.pl:188)
 (4) fail if older

 Now, imagine that the test runs for days/weeks. This pretty much means
 it's wasted, because the results will be thrown away anyway, no?

 
 
 The 24 hours runs from the time of the latest commit on the branch in
 question, not the current time, but basically yes.
 
 We've never had machines with runs that long. The longest in recent
 times has been friarbird, which runs CLOBBER_CACHE_ALWAYS and takes
 around 4.5 hours. But we have had misconfigured machines reporting
 unbelievable snapshot times.  I'll take a look and see if we can tighten
 up the sanity check. It's worth noting that one thing friarbird does is
 skip the install-check stage - it's almost certainly not going to have
 terribly much interesting to tell us from that, given it has already run
 a plain make check.

Well, I'm not sure about misconfigured: both my personal
buildfarm members and pginfra-run ones (like gaibasaurus) got errors
complaining about snapshot too old in the past for long running tests,
so I'm not sure it is really a case of we never had machines with runs that
long. So maybe we should not reject those submissions at submission
time but rather mark them clearly on the dashboard and leave the final
interpretation to a human...




Stefan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Andrew Dunstan


On 05/15/2014 04:30 PM, Stefan Kaltenbrunner wrote:

On 05/15/2014 07:46 PM, Andrew Dunstan wrote:

On 05/15/2014 12:43 PM, Tomas Vondra wrote:

Hi all,

today I got a few of errors like these (this one is from last week,
though):

 Status Line: 493 snapshot too old: Wed May  7 04:36:57 2014 GMT
 Content:
 snapshot to old: Wed May  7 04:36:57 2014 GMT

on the new buildfarm animals. I believe it was my mistake (incorrectly
configured local git mirror), but it got me thinking about how this will
behave with the animals running CLOBBER_CACHE_RECURSIVELY.

If I understand the Perl code correctly, it does this:

(1) update the repository
(2) run the tests
(3) check that the snapshot is not older than 24 hours (pgstatus.pl:188)
(4) fail if older

Now, imagine that the test runs for days/weeks. This pretty much means
it's wasted, because the results will be thrown away anyway, no?



The 24 hours runs from the time of the latest commit on the branch in
question, not the current time, but basically yes.

We've never had machines with runs that long. The longest in recent
times has been friarbird, which runs CLOBBER_CACHE_ALWAYS and takes
around 4.5 hours. But we have had misconfigured machines reporting
unbelievable snapshot times.  I'll take a look and see if we can tighten
up the sanity check. It's worth noting that one thing friarbird does is
skip the install-check stage - it's almost certainly not going to have
terribly much interesting to tell us from that, given it has already run
a plain make check.

Well, I'm not sure about misconfigured: both my personal
buildfarm members and pginfra-run ones (like gaibasaurus) got errors
complaining about snapshot too old in the past for long running tests,
so I'm not sure it is really a case of we never had machines with runs that
long. So maybe we should not reject those submissions at submission
time but rather mark them clearly on the dashboard and leave the final
interpretation to a human...





That's a LOT harder and more work to arrange. Frankly, there are more 
important things to do.


I would like to know the circumstances of these very long runs. I drive 
some of my VMs pretty hard on pretty modest hardware, and they don't 
come close to running 24 hours.


The current behaviour goes back to this commit from December 2011:

   commit a8b5049e64f9cb08f8e165d0737139dab74e3bce
   Author: Andrew Dunstan and...@dunslane.net
   Date:   Wed Dec 14 14:38:44 2011 -0800

Use git snapshot instead of fixed 10 day timeout.

The sanity checks made sure that an animal wasn't submitting a
snapshot that was too old. But sometimes an old branch doesn't
get any changes for more than 10 days. So accept a snapshot that
is not more than 1 day older than the last known snapshot. Per
complaint from Stefan.


I'm prepared to increase the sanity check time if there is a serious 
demand for it, but I'd like to know what to increase it to.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logical replication woes

2014-05-15 Thread Andres Freund
On 2014-05-15 22:30:53 +0300, Heikki Linnakangas wrote:
 Attached patch fixes things, but I want to add some regression tests
 before commit.
 
 Looks good to me.

Attached are two patches. One for the uninitialized dbId/tsId issue; one
for the decoding bug. The former should be backpatched.
Should you wonder about the slight reordering of the assignments in
RecordTransactionCommitPrepared() - I've made it more similar to 
RecordTransactionCommit()
to make it easier to see they are equivalent.

Thanks for the testing!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 07d0f4330f7c25bcec9e356527b0dc86372d2886 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 15 May 2014 22:23:12 +0200
Subject: [PATCH 1/2] Initialize all members of xl_xact_commit during prepared
 transaction commits.

Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but
missed that prepared transaction commits reuse that struct. Fix that.

Because those fields were WAL logged uninitialized, a hot standby
node could miss relcache init file invalidations, leading to errors like
ERROR:  could not open file ...: No such file or directory
on the standby. A restart of the database is sufficient to fix the
problem.
As problems can only be triggered when a system table/index has been
rewritten in a transaction using two phase commit, the problem is
unlikely to have affected many installations.

Found while investigating a logical decoding bugreport from Heikki.

Backpatch to 9.0 where the bug was introduced.
---
 src/backend/access/transam/twophase.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 2cefa08..d5409a6 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -2082,9 +2082,13 @@ RecordTransactionCommitPrepared(TransactionId xid,
 
 	/* Emit the XLOG commit record */
 	xlrec.xid = xid;
-	xlrec.crec.xact_time = GetCurrentTimestamp();
+
 	xlrec.crec.xinfo = initfileinval ? XACT_COMPLETION_UPDATE_RELCACHE_FILE : 0;
-	xlrec.crec.nmsgs = 0;
+
+	xlrec.crec.dbId = MyDatabaseId;
+	xlrec.crec.tsId = MyDatabaseTableSpace;
+
+	xlrec.crec.xact_time = GetCurrentTimestamp();
 	xlrec.crec.nrels = nrels;
 	xlrec.crec.nsubxacts = nchildren;
 	xlrec.crec.nmsgs = ninvalmsgs;
-- 
2.0.0.rc2.4.g1dc51c6.dirty

From de22d8e5308b1c4c509d8566317d9aea0956c2bc Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Thu, 15 May 2014 22:42:27 +0200
Subject: [PATCH 2/2] Fix typo-induced bug in decoding of prepared
 transactions.

The decoding of prepared transaction commits accidentally used the xid
of the transaction performing the COMMIT PREPARED. Before
bb38fb0d43c8d that led to those transactions not being decoded,
afterwards to an assertion failure.
Add tests for the decoding of prepared transactions.

Bug found and fixed by Heikki Linnakangas; new regression test by
Andres Freund.
---
 contrib/test_decoding/Makefile  |  2 +-
 contrib/test_decoding/expected/ddl.out  |  6 +--
 contrib/test_decoding/expected/prepared.out | 82 +
 contrib/test_decoding/sql/ddl.sql   |  2 +-
 contrib/test_decoding/sql/prepared.sql  | 50 ++
 src/backend/replication/logical/decode.c|  2 +-
 6 files changed, 138 insertions(+), 6 deletions(-)
 create mode 100644 contrib/test_decoding/expected/prepared.out
 create mode 100644 contrib/test_decoding/sql/prepared.sql

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 685986c..58e0f38 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -37,7 +37,7 @@ submake-isolation:
 submake-test_decoding:
 	$(MAKE) -C $(top_builddir)/contrib/test_decoding
 
-REGRESSCHECKS=ddl rewrite toast permissions decoding_in_xact binary
+REGRESSCHECKS=ddl rewrite toast permissions decoding_in_xact binary prepared
 
 regresscheck: all | submake-regress submake-test_decoding
 	$(MKDIR_P) regression_output
diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out
index 05a4bd3..37ff8b7 100644
--- a/contrib/test_decoding/expected/ddl.out
+++ b/contrib/test_decoding/expected/ddl.out
@@ -640,8 +640,8 @@ SELECT pg_drop_replication_slot('regression_slot');
 (1 row)
 
 /* check that we aren't visible anymore now */
-SELECT * FROM pg_stat_replication;
- pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state 

[HACKERS] pg_recvlogical tests

2014-05-15 Thread Andres Freund
Hi,

I have some preliminary tests for the pg_recvlogical binary using the
infrastructure Peter added. I am wondering if somebody has a good idea
about how to make the tests more meaningful. Currently all that's tested
are simple commands. Not the main functionality namely the actual
streaming of changes.
I wonder if somebody has a great idea for cancelling pg_recvlogical
after a while. Right now my best idea is to add a new
--stop-after-messages parameter. Does anybody have a better idea?
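
A toy sketch of the proposed knob (the option name is the one floated above; the loop below is a stand-in, not pg_recvlogical's actual receive loop): count messages and exit cleanly once the limit is hit, so a test can terminate deterministically.

#include <stdio.h>

int
main(int argc, char **argv)
{
    long        stop_after_messages = 0;    /* 0 = run forever, as today */
    long        received = 0;
    char        line[1024];

    for (int i = 1; i < argc; i++)
        if (sscanf(argv[i], "--stop-after-messages=%ld",
                   &stop_after_messages) == 1)
            break;

    /* stand-in for the streaming loop: one input line = one decoded message */
    while (fgets(line, sizeof(line), stdin) != NULL)
    {
        received++;
        fputs(line, stdout);
        if (stop_after_messages > 0 && received >= stop_after_messages)
            break;              /* clean, predictable shutdown for a test */
    }
    fprintf(stderr, "received %ld message(s)\n", received);
    return 0;
}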

If that's the way, would somebody object to tests and the parameter
being added now? The potential harm seems pretty low and the additional
tests would cover the walsender interface that's not covered by any
tests right now...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CTE that result in repeated sorting of the data

2014-05-15 Thread Jon Nelson
I was watching a very large recursive CTE get built today and this CTE
involves on the order of a dozen or so loops joining the initial
table against existing tables. It struck me that - every time through
the loop the tables were sorted and then joined and that it would be
much more efficient if the tables remained in a sorted state and could
avoid being re-sorted each time through the loop. Am I missing
something here? I am using PG 8.4 if that matters.

-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Alvaro Herrera
Andres Freund wrote:
 On 2014-05-15 15:40:06 -0400, Robert Haas wrote:
  On Thu, May 15, 2014 at 2:34 PM, Bruce Momjian br...@momjian.us wrote:

  If the larger clog size is a show-stopper (and I'm not sure I have an
  intelligent opinion on that just yet), one way to get around the
  problem would be to summarize CLOG entries after-the-fact.  Once an
  XID precedes the xmin of every snapshot, we don't need to know the
  commit LSN any more.  So we could read the old pg_clog files and write
  new summary files.  Since we don't need to care about subcommitted
  transactions either, we could get by with just 1 bit per transaction,
  1 = committed, 0 = aborted.  Once we've written and fsync'd the
  summary files, we could throw away the original files.  That might
  leave us with a smaller pg_clog than what we have today.
 
 I think the easiest way for now would be to have pg_clog with the same
 format as today and a rangewise much smaller pg_csn storing the lsns
 that are needed. That'll leave us with pg_upgrade'ability without
 needing to rewrite pg_clog during the upgrade.

Err, we're proposing a patch to add timestamps to each commit,
http://www.postgresql.org/message-id/2013101600.ge4...@eldon.alvh.no-ip.org
which does so in precisely this way.

The idea that pg_csn or pg_committs can be truncated much earlier than
pg_clog has its merit, no doubt.  If we can make sure that the atomicity
is sane, +1 from me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Andres Freund
On 2014-05-15 17:37:14 -0400, Alvaro Herrera wrote:
 Andres Freund wrote:
  On 2014-05-15 15:40:06 -0400, Robert Haas wrote:
   On Thu, May 15, 2014 at 2:34 PM, Bruce Momjian br...@momjian.us wrote:
 
   If the larger clog size is a show-stopper (and I'm not sure I have an
   intelligent opinion on that just yet), one way to get around the
   problem would be to summarize CLOG entries after-the-fact.  Once an
   XID precedes the xmin of every snapshot, we don't need to know the
   commit LSN any more.  So we could read the old pg_clog files and write
   new summary files.  Since we don't need to care about subcommitted
   transactions either, we could get by with just 1 bit per transaction,
   1 = committed, 0 = aborted.  Once we've written and fsync'd the
   summary files, we could throw away the original files.  That might
   leave us with a smaller pg_clog than what we have today.
  
  I think the easiest way for now would be to have pg_clog with the same
  format as today and a rangewise much smaller pg_csn storing the lsns
  that are needed. That'll leave us with pg_upgrade'ability without
  needing to rewrite pg_clog during the upgrade.
 
 Err, we're proposing a patch to add timestamps to each commit,
 http://www.postgresql.org/message-id/2013101600.ge4...@eldon.alvh.no-ip.org
 which does so in precisely this way.

I am not sure where my statements above conflict with committs?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..

2014-05-15 Thread Jeff Janes
On Wed, May 14, 2014 at 8:26 AM, Robert Haas robertmh...@gmail.com wrote:

 On Sun, May 11, 2014 at 12:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Simon Riggs si...@2ndquadrant.com writes:
  On 11 May 2014 11:18, Andres Freund and...@2ndquadrant.com wrote:
  I don't know. I'd find UPDATE/DELETE ORDER BY something rather
  useful.
 
  Perhaps if an index exists to provide an ordering that makes it clear
  what this means, then yes.
 
  The $64 question is whether we'd accept an implementation that fails
  if the target table has children (ie, is partitioned).

 I'd say no.  Partitioning is important, and we need to make it more
 seamless and better-integrated, not add new warts.



I think the importance of partitioning argues the other way on this issue.
 Where I most wanted a LIMIT clause on DELETE is where I was moving tuples
from one partition to a different one in a transactional way using
bite-size chunks that wouldn't choke the live system with locks or with IO.


So the DELETE was always running against either a child by name, or against
ONLY parent, not against the whole inheritance tree.  Not being able to do
this on single partitions makes partitioning harder, not easier.

Sure, I can select the nth smallest ctid and then WITH T AS (DELETE FROM
ONLY foo WHERE ctid  :that RETURNING *) INSERT INTO bar SELECT * from T,
but how annoying.



  That seems
  to me to not be up to the project's usual quality expectations, but
  maybe if there's enough demand for a partial solution we should do so.

 I like this feature, but if I were searching for places where it makes
 sense to loosen our project's usual quality expectations, this isn't
 where I'd start.


In this case I'd much rather have half a loaf than none at all.  We
wouldn't be adding warts to partitioning, but removing warts from the
simpler case.

Cheers,

Jeff


Re: [HACKERS] CTE that result in repeated sorting of the data

2014-05-15 Thread David G Johnston
Jon Nelson-14 wrote
 I was watching a very large recursive CTE get built today and this CTE
 involves on the order of a dozen or so loops joining the initial
 table against existing tables. It struck me that - every time through
 the loop the tables were sorted and then joined and that it would be
 much more efficient if the tables remained in a sorted state and could
 avoid being re-sorted each time through the loop. Am I missing
 something here? I am using PG 8.4 if that matters.

I'm not sure what you mean by watching, but maybe this is as simple as
changing your CTE to use UNION ALL instead of UNION [DISTINCT]?

If you really think it could be improved upon maybe you can help and provide
a minimal self-contained example query and data that exhibits the behavior
you describe so others can see it and test changes?  It would be nice to
know whether versions other than one that is basically no longer supported
exhibit the same behavior.

Or maybe someone more familiar with the implementation of recursive CTE will
chime in - my knowledge in this area is fairly limited.

David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/CTE-that-result-in-repeated-sorting-of-the-data-tp5804136p5804140.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_recvlogical tests

2014-05-15 Thread Euler Taveira
On 15-05-2014 18:09, Andres Freund wrote:
 I have some preliminary tests for the pg_recvlogical binary using the
 infrastructure Peter added. I am wondering if somebody has a good idea
 about how to make the tests more meaningful. Currently all that's tested
 are simple commands. Not the main functionality namely the actual
 streaming of changes.

Could you post your preliminary patch?

 I wonder if somebody has a great idea for cancelling pg_recvlogical
 after a while. Right now my best idea is to add a new
 --stop-after-messages parameter. Does anybody have a better idea?
 
pgbench uses two stop conditions: (i) number of transactions and (ii)
time. These could be applied to pg_recvlogical too.

--transactions=NUM
--time=NUM

 If that's the way, would somebody object to tests and the parameter
 being added now? The potential harm seems pretty low and the additional
 tests would cover the walsender interface that's not covered by any
 tests right now...
 
beta is not the right time to add even *minor* functionality. Let's do
it for 9.5.


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_recvlogical tests

2014-05-15 Thread Andres Freund
On 2014-05-15 18:52:45 -0300, Euler Taveira wrote:
 On 15-05-2014 18:09, Andres Freund wrote:
  I have some preliminary tests for the pg_recvlogical binary using the
  infrastructure Peter added. I am wondering if somebody has a good idea
  about how to make the tests more meaningful. Currently all that's tested
  are simple commands. Not the main functionality namely the actual
  streaming of changes.
 
 Could you post your preliminary patch?

It's not particularly interesting yet, but attached.

  I wonder if somebody has a great idea for cancelling pg_recvlogical
  after a while. Right now my best idea is to add a new
  --stop-after-messages parameter. Does anybody have a better idea?
  
 pgbench uses two stop conditions: (i) number of transactions and (ii)
 time. These could be applied to pg_recvlogical too.
 
 --transactions=NUM
 --time=NUM

Sounds too complicated for what I need it for. The former isn't easily
implementable because pg_recvlogical doesn't know about transactions and
the latter isn't that interesting for the tests because it'll have
timing issues...

  If that's the way, would somebody object to tests and the parameter
  being added now? The potential harm seems pretty low and the additional
  tests would cover the walsender interface that's not covered by any
  tests right now...
  
 beta is not the right time to add even a *minor* functionality. Let's do
 it for 9.5.

I am not interested in the feature itself at all. I am interested in
adding more tests for new functionality that's currently not tested in
an automated fashion. And I do think beta isn't a bad time for that. We
didn't *have* the infrastructure for the tests of binaries until the end
of the last CF, so I couldn't have added it during development.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


030_pg_recvlogical.pl
Description: Perl program

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2014-05-15 Thread Alvaro Herrera
Andres Freund wrote:
 On 2014-05-15 17:37:14 -0400, Alvaro Herrera wrote:
  Andres Freund wrote:
   On 2014-05-15 15:40:06 -0400, Robert Haas wrote:
On Thu, May 15, 2014 at 2:34 PM, Bruce Momjian br...@momjian.us wrote:
  
If the larger clog size is a show-stopper (and I'm not sure I have an
intelligent opinion on that just yet), one way to get around the
problem would be to summarize CLOG entries after-the-fact.  Once an
XID precedes the xmin of every snapshot, we don't need to know the
commit LSN any more.  So we could read the old pg_clog files and write
new summary files.  Since we don't need to care about subcommitted
transactions either, we could get by with just 1 bit per transaction,
1 = committed, 0 = aborted.  Once we've written and fsync'd the
summary files, we could throw away the original files.  That might
leave us with a smaller pg_clog than what we have today.
   
   I think the easiest way for now would be to have pg_clog with the same
   format as today and a rangewise much smaller pg_csn storing the lsns
   that are needed. That'll leave us with pg_upgrade'ability without
   needing to rewrite pg_clog during the upgrade.
  
  Err, we're proposing a patch to add timestamps to each commit,
  http://www.postgresql.org/message-id/2013101600.ge4...@eldon.alvh.no-ip.org
  which does so in precisely this way.
 
 I am not sure where my statements above conflict with committs?

I didn't say it did ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sepgsql: label regression test failed

2014-05-15 Thread Kohei KaiGai
Sorry, I've forgotten the report.

The failures in the label test come from a specification change in the mcs policy.
Previously, it was applied to all the domains including unconfined_t, but now
it is applied only to domains with the mcsconstrained attribute.

This regression test runs sepgsql_setcon() in the system unconfined_t domain
and checks the behavior when a process tries to move to a wider or narrower
range of categories, so it was affected by the system policy change even
though sepgsql itself behaves as intended.

The attached patch adds the mcsconstrained attribute to the domain used for this
regression test, if the attribute exists, so it will work on both F20 and
older systems.

Regarding the regression tests on ddl and alter, it looks to me as if the
hook invocations around recomputeNamespacePath() are gone, because a
schema that was already allowed to be searched had already been checked.
Has the behavior around recomputeNamespacePath() been updated recently?
At least it is not a problem here, since the {search} permission on
regtest_schema_2 is checked in this test scenario.

Thanks,

2014-05-14 13:33 GMT+09:00 Sergey Muraviov sergey.k.murav...@gmail.com:
 Hi.

 Some regression tests for sepgsql still do not work on Fedora 20:

 == running regression test queries==
 test label... FAILED
 test dml  ... ok
 test ddl  ... FAILED
 test alter... FAILED
 test misc ... ok

 ==
  3 of 5 tests failed.
 ==

 $ sestatus
 SELinux status: enabled
 SELinuxfs mount:/sys/fs/selinux
 SELinux root directory: /etc/selinux
 Loaded policy name: targeted
 Current mode:   enforcing
 Mode from config file:  enforcing
 Policy MLS status:  enabled
 Policy deny_unknown status: allowed
 Max kernel policy version:  29

 $ uname -i -o -r
 3.14.3-200.fc20.x86_64 x86_64 GNU/Linux

 $ /usr/local/pgsql/bin/postgres --version
 postgres (PostgreSQL) 9.4beta1

 PS
 I've got this compiler warning:
  relation.c: In function ‘sepgsql_relation_drop’:
 relation.c:472:25: warning: ‘tclass’ may be used uninitialized in this
 function [-Wmaybe-uninitialized]
   sepgsql_avc_check_perms(object,
  ^


 2013-12-25 0:34 GMT+04:00 Kohei KaiGai kai...@kaigai.gr.jp:

 Hello,

 It seems to me that changes in the base security policy on Fedora affected
 the regression test. Our test cases for sepgsql_setcon() utilize the MCS
 rules, which prevent domain transitions from narrower categories to wider
 ones, to control the success cases and failure cases.

 However, its coverage was changed. It used to be applied to all the domains
 in the system, so the unconfined_t domain was constrained by the MCS rules.
 But now it is applied only to domains with the mcs_constrained_type
 attribute.

 [kaigai@vmlinux tmp]$ diff -up old/policy/mcs new/policy/mcs
   :
  snip
   :
  mlsconstrain process { transition dyntransition }
 -   (( h1 dom h2 ) or ( t1 == mcssetcats ));
 +   (( h1 dom h2 ) or ( t1 != mcs_constrained_type ));

 Probably we need to define our own domain for the regression test to ensure
 test stability, instead of using the system unconfined domain, whose meaning
 differs between releases.

 I'll make a patch. Please wait for a while.

 Thanks for your test & reports.

 2013/12/18 Sergey Muraviov sergey.k.murav...@gmail.com:
  # semodule -l | grep sepgsql
  sepgsql-regtest 1.07
 
  Full list of modules is in attachment.
 
 
  2013/12/18 Kohei KaiGai kai...@kaigai.gr.jp
 
  Could you show me semodule -l on your environment?
  I believe security policy has not been changed between F19 and F20...
 
  Thanks,
 
  2013/12/18 Sergey Muraviov sergey.k.murav...@gmail.com:
   Hi
  
   I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20
   and met with a label regression test failure.
  
   PS
   I've got some warning during build process.
  
   --
   Best regards,
   Sergey Muraviov
  
  
   --
   Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
   To make changes to your subscription:
   http://www.postgresql.org/mailpref/pgsql-hackers
  
 
 
 
  --
  KaiGai Kohei kai...@kaigai.gr.jp
 
 
 
 
  --
  Best regards,
  Sergey Muraviov



 --
 KaiGai Kohei kai...@kaigai.gr.jp




 --
 Best regards,
 Sergey Muraviov



-- 
KaiGai Kohei kai...@kaigai.gr.jp


sepgsql-fixup-regtest-policy.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CTE that result in repeated sorting of the data

2014-05-15 Thread Jon Nelson
On Thu, May 15, 2014 at 4:50 PM, David G Johnston
david.g.johns...@gmail.com wrote:
 Jon Nelson-14 wrote
 I was watching a very large recursive CTE get built today, and this CTE
 involves on the order of a dozen or so loops joining the initial
 table against existing tables. It struck me that every time through
 the loop the tables were sorted and then joined, and that it would be
 much more efficient if the tables remained in a sorted state and could
 avoid being re-sorted each time through the loop. Am I missing
 something here? I am using PG 8.4, if that matters.

 I'm not sure what you mean by watching, but maybe this is as simple as
 changing your CTE to use UNION ALL instead of UNION [DISTINCT]?

In fact, I'm using UNION ALL.

 If you really think it could be improved upon, maybe you can help and provide
 a minimal self-contained example query and data that exhibit the behavior
 you describe, so others can see it and test changes?  It would be nice to
 know whether versions other than one that is basically no longer supported
 exhibit the same behavior.

Pretty much any CTE that looks like this:

with recursive cte AS (
  select stuff from A
  UNION ALL
  select more_stuff from B, cte WHERE join conditions
) SELECT * FROM cte;

*and* where the planner chooses to join B and cte by sorting and doing
a merge join.

I'll see if I can come up with a self-contained example.
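
In the meantime, here is a minimal sketch of the shape I mean, with made-up
table names (whether the planner actually picks the sort + merge join plan
for the recursive term will of course depend on the data and statistics):

-- toy schema; assumes the data is acyclic
CREATE TABLE edges (src int, dst int);
CREATE INDEX edges_src_idx ON edges (src);

WITH RECURSIVE cte AS (
    SELECT dst FROM edges WHERE src = 1
    UNION ALL
    SELECT e.dst
      FROM edges e
      JOIN cte ON e.src = cte.dst
)
SELECT * FROM cte;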


-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Error in running DBT2

2014-05-15 Thread Rohit Goyal
Hi All,

I am using CentOS 6 and, after all configuration, I run the command below


*dbt2-run-workload -a pgsql -d 120 -w 1 -o /tmp/result -c 10*
*Error:*
Stage 3. Processing of results...
Killing client...
waiting for server to shut down done
server stopped
Traceback (most recent call last):
  File /tmp/dbt2/bin/dbt2-post-process, line 14, in module
import rpy2.robjects as robjects
ImportError: No module named rpy2.robjects
Test completed.
Results are in: /tmp/result

Please guide me !! :)

Regards,
Rohit Goyal


-- 
Regards,
Rohit Goyal


Re: [HACKERS] Error in running DBT2

2014-05-15 Thread Andrew Dunstan



On 05/15/2014 06:37 PM, Rohit Goyal wrote:

Hi All,

I am using CentOS 6 and, after all configuration, I run the command below

*dbt2-run-workload -a pgsql -d 120 -w 1 -o /tmp/result -c 10
*
*Error:*
Stage 3. Processing of results...
Killing client...
waiting for server to shut down done
server stopped
Traceback (most recent call last):
  File /tmp/dbt2/bin/dbt2-post-process, line 14, in module
import rpy2.robjects as robjects
ImportError: No module named rpy2.robjects
Test completed.
Results are in: /tmp/result

Please guide me !! :)





Do these questions about running dbt2 even belong on pgsql-hackers? They 
seem to me to be usage questions that belong on pgsql-general.


They are also woefully inadequate in the detail they provide. I don't 
see how anyone could take the above report and give any sort of opinion, 
other than that it is a Python error and not apparently a postgres error 
at all.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Error in running DBT2

2014-05-15 Thread Peter Geoghegan
On Thu, May 15, 2014 at 3:47 PM, Andrew Dunstan and...@dunslane.net wrote:
 Do these questions about running dbt2 even belong on pgsql-hackers? They
 seem to me to be usage questions that belong on pgsql-general.

I agree.

Anyway, perhaps the OP will have more luck with OLTPBenchmark, which
has some kind of TPC-C support:
http://oltpbenchmark.com/wiki/index.php?title=Main_Page


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Tomas Vondra
On 15.5.2014 22:56, Andrew Dunstan wrote:
 
 On 05/15/2014 04:30 PM, Stefan Kaltenbrunner wrote:

 well, I'm not sure about misconfigured, but both my personal
 buildfarm members and pginfra-run ones (like gaibasaurus) got errors
 complaining about snapshot too old in the past for long-running tests,
 so I'm not sure it is really a case of we never had a machine with runs
 that long. So maybe we should not reject those submissions at submission
 time but rather mark them clearly on the dashboard and leave the final
 interpretation to a human...

 
 That's a LOT harder and more work to arrange. Frankly, there are more
 important things to do.
 
 I would like to know the circumstances of these very long runs. I drive
 some of my VMs pretty hard on pretty modest hardware, and they don't
 come close to running 24 hours.
 
 The current behaviour goes back to this commit from December 2011:
 
commit a8b5049e64f9cb08f8e165d0737139dab74e3bce
Author: Andrew Dunstan and...@dunslane.net
Date:   Wed Dec 14 14:38:44 2011 -0800
 
 Use git snapshot instead of fixed 10 day timeout.
 
 The sanity checks made sure that an animal wasn't submitting a
 snapshot that was too old. But sometimes an old branch doesn't
 get any changes for more than 10 days. So accept a snapshot that
 is not more than 1 day older than the last known snapshot. Per
 complaint from Stefan.
 
 
 I'm prepared to increase the sanity check time if there is a serious
 demand for it, but I'd like to know what to increase it to.

I doubt there's a one-size-fits-all limit. If the machines running
recursive clobber tests need tens of days to complete the tests, then
that limit is pretty useless for most regular animals.

So what about keeping the current value for most animals, but allowing
an override for some selected ones? I'd expect this to be much simpler
to implement, and it shouldn't require any human intervention.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] buildfarm animals and 'snapshot too old'

2014-05-15 Thread Tomas Vondra
On 15.5.2014 22:07, Andrew Dunstan wrote:
 
 Yes, I've seen that. Frankly, a test that takes something like 500 
 hours is a bit crazy.

Maybe. It certainly is not a test people will use during development.
But if it can detect some hard-to-find errors in the code that might
possibly lead to serious problems, then +1 from me to run them at least
on one animal. 500 hours is ~3 weeks, which is not that bad IMHO.

Also, once you know where it fails, the developer can run just that
single test (which might take minutes/hours, but not days).

 If we really want to run this in the buildfarm we should probably
 try to create a massively cut down test schedule for use in this
 case.

If we can cut this down in a meaningful way (i.e. without
sacrificing most of the benefits) then sure - let's do that. But I think
that's what CLOBBER_CACHE_ALWAYS is about.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Trigger concurrent execution

2014-05-15 Thread Blagoj Petrushev
Hi,

I'm thinking of an extension to trigger functionality like this:

CREATE TRIGGER trigger_name
AFTER event
ON table
CONCURRENTLY EXECUTE PROCEDURE trigger_fc

This would call the trigger after the end of the transaction.

The following is a use-case, please tell me if I'm doing it wrong.

I have a big table with big text column article and a nullable
tsvector column fts_article. On each insert or update that changes the
article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
daemon listener, I catch the notification and update fts_article
accordingly with my_fts_fc(article). The reason I don't do this
directly in my trigger is because my_fts_fc is slow for big articles,
fts_article has a gin index, and also, on heavy load, my listener can
do these updates concurrently. Now, with a concurrent execution of
triggers, I can just call my_fts_fc inside the trigger instead of the
notify roundtrip.
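
For reference, the workaround trigger I use today looks roughly like this
(table and column names simplified):

CREATE OR REPLACE FUNCTION article_changed_notify() RETURNS trigger AS $$
BEGIN
    -- hand the row id to the external listener; the expensive
    -- my_fts_fc(article) call then happens outside this transaction
    PERFORM pg_notify('article_changed', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER article_changed_trg
    AFTER INSERT OR UPDATE OF article ON articles
    FOR EACH ROW EXECUTE PROCEDURE article_changed_notify();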

Kind regards,
Blagoj Petrushev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 release notes

2014-05-15 Thread Andres Freund
On 2014-05-04 08:46:07 -0400, Bruce Momjian wrote:
 I have completed the initial version of the 9.4 release notes.  You can
 view them here:
 
   http://www.postgresql.org/docs/devel/static/release-9-4.html
 
 I will be adding additional markup in the next few days.
 
 Feedback expected and welcomed.  I expect to be modifying this until we
 release 9.4 final.  I have marked items where I need help with question
 marks.

This time I started reading from the end. I think I've fixed most of the
questionable things (i.e. ? or FIXMEs) left.

I am not really sure how to rewrite the notes for the logical decoding
stuff into a more appropriate format for the release notes. Currently it
seems to describe too many details and not enough overview. It's also
probably too long.

How about letting it keep its <sect4> but remove the <itemizedlist> and
put a short explanation about the individual parts into a following
<para> or two? That'd require a name after a <sect4>, which normally
isn't done...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 18511f2e4cd7d72b8d943efd9e6501d3903a64c4 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Fri, 16 May 2014 01:37:07 +0200
Subject: [PATCH] Further 9.4 release notes improvements.

---
 doc/src/sgml/release-9.4.sgml | 52 +++
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/release-9.4.sgml b/doc/src/sgml/release-9.4.sgml
index 3070d0b..e143178 100644
--- a/doc/src/sgml/release-9.4.sgml
+++ b/doc/src/sgml/release-9.4.sgml
@@ -30,7 +30,7 @@
  listitem
   para
link linkend=logicaldecodingLogical decoding/link allows database
-   changes to be streamed out in customizable format
+   changes to be streamed out in a customizable format
   /para
  /listitem
 
@@ -298,6 +298,12 @@
  /para
 /listitem
 
+listitem
+ para
+  commandDISCARD ALL/ now also discards the states of sequences.
+ /para
+/listitem
+
/itemizedlist
 
   /sect2
@@ -1005,7 +1011,6 @@
/para
 
para
-!-- FIXME: drop? --
 This was added so views that select from a table with zero columns
 can be dumped correctly.
/para
@@ -1028,7 +1033,6 @@
/para
 
para
-!-- FIXME: compatibility break entry? --
 commandDISCARD ALL/ will now also discard such information.
/para
   /listitem
@@ -1199,6 +1203,11 @@
 AGGREGATE//link to supply the size of the aggregate's
 transition state data (Hadi Moshayedi)
/para
+
+   para
+This allows the plannet to better estimate how much memory will be
+used when aggregating.
+   /para
   /listitem
 
  /itemizedlist
@@ -1218,7 +1227,7 @@
 
   listitem
para
-Allow the changing of foreign key constraint  via link
+Allow the changing foreign key constraints's deferrability via link
 linkend=SQL-ALTERTABLEcommandALTER TABLE//link
 ... literalALTER CONSTRAINT/ (Simon Riggs)
/para
@@ -1254,7 +1263,7 @@
 
   listitem
para
-Fully-implement the link
+Fully implement the link
 linkend=datatype-linetypeline//link data type (Peter
 Eisentraut)
/para
@@ -1472,7 +1481,7 @@
para
 Add function link
 linkend=functions-admin-dblocationfunctionpg_filenode_relation()//link
-to allow for more efficient filenode to relation lookups (Andres
+to allow for more efficient lookups from filenode to relation (Andres
 Freund)
/para
   /listitem
@@ -1543,10 +1552,13 @@
   /listitem
 
   listitem
-   !-- FIXME --
para
 Allow polymorphic aggregates to have non-polymorphic state data
-types ? (Tom Lane)
+types (Tom Lane)
+   /para
+   para
+This allows to declare aggregates like the builtin
+functionarray_agg()/ from SQL.
/para
   /listitem
 
@@ -1772,8 +1784,8 @@
 
   listitem
para
-Allow field wrapping to applicationpsql/'s extended mode
-(Sergey Muraviov)
+Add ability to wrap long lines in applicationpsql/'s expanded
+mode by using command\pset format wrapped/ (Sergey Muraviov)
/para
   /listitem
 
@@ -2218,7 +2230,8 @@
   listitem
para
 Add link linkend=pgprewarmapplicationpg_prewarm//link
-to preload relation data into the shared buffer cache (Robert Haas)
+extension to preload relation data into the shared buffer cache
+(Robert Haas)
/para
 
para
@@ -2243,7 +2256,7 @@
 
   listitem
para
-Add logging of trigger execution to link
+Add option to include trigger execution time to link
 linkend=auto-explainapplicationauto_explain//link
 

Re: [HACKERS] Error in running DBT2

2014-05-15 Thread David G Johnston
Andrew Dunstan wrote
 On 05/15/2014 06:37 PM, Rohit Goyal wrote:
 Hi All,

 I am using CentOS 6 and, after all configuration, I run the command below

 *dbt2-run-workload -a pgsql -d 120 -w 1 -o /tmp/result -c 10
 *
 *Error:*
 Stage 3. Processing of results...
 Killing client...
 waiting for server to shut down done
 server stopped
 Traceback (most recent call last):
   File /tmp/dbt2/bin/dbt2-post-process, line 14, in 
 module
 import rpy2.robjects as robjects
 ImportError: No module named rpy2.robjects
 Test completed.
 Results are in: /tmp/result

 Please guide me !! :)


 
 
 Do these questions about running dbt2 even belong on pgsql-hackers? They 
 seem to me to be usage questions that belong on pgsql-general.
 
 They are also woefully inadequate in the detail they provide. I don't 
 see how anyone could take the above report and give any sort of opinion, 
 other than that it is a Python error and not apparently a postgres error 
 at all.

Actually, 

osdldbt-gene...@lists.sourceforge.net

would be the proper location and I do see that the OP has found said mailing
list and even got a response a few days ago.

I guess it makes some sense to shout for help from the rooftops, but anyone
willing and able to provide personal training to a DBT2 newbie is likely
going to be monitoring the aforementioned list.

One other suggestion: after seeing such an error as the above, actually
tell people that you have made some rudimentary attempt to confirm that,
indeed, you have:

1) Python installed
2) R installed
3) The python module rpy2 installed

I am thinking CentOS 6 doesn't have a packaging system that ensures that all
of the dependencies are present, so it seems the user needs to check them
and affirm that they are installed when asking for help.

Another helpful outcome of trying to understand what people need in order to
help you is that you would realize that the message "Results are in:
/tmp/result" indicates that there is an additional file that someone debugging
(you or someone on these mailing lists) would likely find helpful to review.
Attach that file - either in full or in part - or at least indicate you looked
at it and didn't see anything unusual (unlikely that, though).

Sorry I cannot help you directly but maybe I can help you be more successful
in asking for help in the future :)

Good Luck!

David J.

IOW: Guiding someone through debugging on a mailing list is very tedious
and difficult for the people doing the guiding - do as much exploring as
possible on your own and tell people where you have already been and what
you have seen.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Error-in-running-DBT2-tp5804147p5804161.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.4 release notes

2014-05-15 Thread David G Johnston
Andres Freund-3 wrote
 On 2014-05-04 08:46:07 -0400, Bruce Momjian wrote:
 I have completed the initial version of the 9.4 release notes.  You can
 view them here:
 
  http://www.postgresql.org/docs/devel/static/release-9-4.html
 
 I will be adding additional markup in the next few days.
 
 Feedback expected and welcomed.  I expect to be modifying this until we
 release 9.4 final.  I have marked items where I need help with question
 marks.
 
 This time I started reading from the end. I think I've fixed most of the
 questionable things (i.e. ? or FIXMEs) left.
 
 I am not really sure how to rewrite the notes for the logical decoding
 stuff into a more appropriate format for the release notes. Currently it
 seems to describe too many details and not enough overview. It's also
 probably too long.
 
 How about letting it keep its <sect4> but remove the <itemizedlist> and
 put a short explanation about the individual parts into a following
 <para> or two? That'd require a name after a <sect4>, which normally
 isn't done...
 
 Greetings,
 
 Andres Freund

Some errors and suggestions - my apologies for the format, as I do not have
a proper patching routine set up.

Patch Review - Top to Bottom (mostly, I think...)

s/constraints's/constraint/ - possessive not needed here, it is a property
of the constraint, not owned by it.

s/plannet/planner

commandDISCARD ALL/ now also discards the states of sequences.
change to
commandDISCARD ALL/ now also discards sequence state.

IIUC: Logical decoding allows for streaming of statement-scoped database
changes.

Add function pg_filenode_relation() to more efficiently lookup relations via
their filenode.

This allows one to declare array_agg()-like aggregates using SQL.

IIUC: Remove line length restrictions from pgbench.


These are not in the patch but from my quick scan of the online release
notes - top to bottom:

then conditionally additional adjacent whitespace if not in FX mode
-
then, conditionally, remove additional adjacent whitespace if not in FX
mode.
(I presume those conditions are noted in the documentation somewhere)

For example, previously format string space-space-space would consume only
the first space in ' 12', while it will not consume all three characters.
- 
For example, the format string space-space-space previously consumed only
the first space in 'space-space-12' whereas now it will consume all three
characters.

style: add comma - Previously[,] empty arrays were returned (occurs
frequently but is minor)

style: NULL VARIADIC function arguments are now disallowed
-
Disallow NULL VARIADIC function arguments
(most of the notes are verb-leading in structure)

During immediate shutdown, send uncatchable termination - kill the comma

In contrast to local_preload_libraries, this parameter can load any shared
library - shoot the comma

The linking on this one is odd:
Have Windows ASCII-encoded databases and server process (e.g. postmaster)
emit messages in the LC_CTYPE-defined language (Alexander Law, Noah Misch)


Add ROWS FROM() syntax to allow horizontal concatenation of set-returning
functions in the FROM-clause (Andrew Gierth)
- Maybe a note about using this to avoid least-common-multiple expansion?

Add WITH ORDINALITY syntax which numbers rows returned from FROM-clause
functions
- 
Add WITH ORDINALITY syntax to number the rows returned by FROM-clause
functions.
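
For reference, a made-up example combining the two items above (9.4): ROWS
FROM pads the shorter function's result with NULLs instead of recycling its
rows, and WITH ORDINALITY numbers the output:

SELECT *
  FROM ROWS FROM (generate_series(1, 4), generate_series(1, 2))
       WITH ORDINALITY AS t(a, b, n);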

???
DISCARD ALL will now also discard such information.
-
DISCARD ALL now also invokes this command.


be converted to NULL in in CSV mode - strike one of the ins


[I got no clue on this pair... but recommend someone rewrite it]
Improve the internal definition of system relations (Andres Freund, Robert
Haas)
Previously, relations once moved into the system catalog schema could no
longer be modified or dropped.


David J.





--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/9-4-release-notes-tp5802343p5804163.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Trigger concurrent execution

2014-05-15 Thread David G Johnston
Blagoj Petrushev wrote
 Hi,
 
 I'm thinking of an extension to trigger functionality like this:
 
 CREATE TRIGGER trigger_name
 AFTER event
 ON table
 CONCURRENTLY EXECUTE PROCEDURE trigger_fc
 
 This would call the trigger after the end of the transaction.
 
 The following is a use-case, please tell me if I'm doing it wrong.
 
 I have a big table with big text column article and a nullable
 tsvector column fts_article. On each insert or update that changes the
 article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
 daemon listener, I catch the notification and update fts_article
 accordingly with my_fts_fc(article). The reason I don't do this
 directly in my trigger is because my_fts_fc is slow for big articles,
 fts_article has a gin index, and also, on heavy load, my listener can
 do these updates concurrently. Now, with a concurrent execution of
 triggers, I can just call my_fts_fc inside the trigger instead of the
 notify roundtrip.

Conceptually, trigger actions run in-transaction and can cause it to
ROLLBACK; so how would "after the end of the transaction" work?  Since the
easy way would be to have COMMIT; block until all of the AFTER event
concurrent triggers fire, I presume you would want something more like a task
queue for background workers where, at commit, the function call is placed in
a FIFO queue and the calling session is allowed to move on to other activity.
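
You can get fairly close to that today with a plain queue table written by the
trigger (it commits or rolls back together with the article change) and one or
more workers draining it; a minimal sketch with made-up names:

CREATE TABLE fts_queue (
    id     bigserial PRIMARY KEY,
    row_id bigint NOT NULL
);

CREATE OR REPLACE FUNCTION enqueue_fts_work() RETURNS trigger AS $$
BEGIN
    INSERT INTO fts_queue (row_id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- the workers then pull items in their own transactions, e.g.
-- DELETE FROM fts_queue WHERE id = ... RETURNING row_id;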

It is not clear what you mean by "my listener can do these updates
concurrently". Concurrently with each other, or concurrently with other DML
activity on the table?  I assume you have multiple listeners, since the
potential insert rate of the documents is likely much greater than the rate
of update/indexing.

Also, it would seem you'd typically want the GIN index to be updated once
the corresponding transaction commits and makes the rest of the data
available.  Or does your use case allow for some delay between the article
being in the database physically and it being available in the index?

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Trigger-concurrent-execution-tp5804158p5804164.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.0 PDF build broken

2014-05-15 Thread Peter Eisentraut
Mysteriously, commit 6b2a1445ec8a631060c4cbff3f172bf31d3379b9 has broken
the PDF build (openjade + pdfjadetex) in the 9.0 branch only.  The error
is

[256.0.28
! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than \pd
fstartlink.
\AtBegShi@Output ...ipout \box \AtBeginShipoutBox 
  \fi \fi 
l.241723 ...char95{}stat\char95{}file('filename');
  
!  == Fatal error occurred, no output PDF file produced!
Transcript written on postgres-A4.log.
make: *** [postgres-A4.pdf] Error 1


I have reproduced this on two different platforms, and it affects only
this branch.  I guess this change might have caused the page boundaries
to shift in an unfortunate way.  I seem to recall we have had similar
problems before.  Does anyone remember?




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.0 PDF build broken

2014-05-15 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Mysteriously, commit 6b2a1445ec8a631060c4cbff3f172bf31d3379b9 has broken
 the PDF build (openjade + pdfjadetex) in the 9.0 branch only.  The error
 is

 [256.0.28
 ! pdfTeX error (ext4): \pdfendlink ended up in different nesting level than 
 \pd
 fstartlink.

Oh, not that again.

 I have reproduced this on two different platforms, and it affects only
 this branch.  I guess this change might have caused the page boundaries
 to shift in an unfortunate way.  I seem to recall we have had similar
 problems before.  Does anyone remember?

Yeah.  This is caused by a hyperlink whose displayed text crosses a page
boundary.  The only known fix is to change the text enough so the link
no longer runs across a page boundary.  Unfortunately, pdfTeX is pretty
unhelpful about identifying exactly where the problem is.  I seem to
recall having posted a recipe about finding such problems.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.0 PDF build broken

2014-05-15 Thread Tom Lane
I wrote:
 Yeah.  This is caused by a hyperlink whose displayed text crosses a page
 boundary.  The only known fix is to change the text enough so the link
 no longer runs across a page boundary.  Unfortunately, pdfTeX is pretty
 unhelpful about identifying exactly where the problem is.  I seem to
 recall having posted a recipe about finding such problems.

Ah, found it:
http://www.postgresql.org/message-id/9473.1296172...@sss.pgh.pa.us

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] psql \db+ lack of size column

2014-05-15 Thread Fabrízio de Royes Mello
Hi all,

Is there some reason not to show the tablespace size in the \db+ psql
command?
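
For now I get it by hand with something like:

SELECT spcname,
       pg_size_pretty(pg_tablespace_size(oid)) AS size
  FROM pg_tablespace
 ORDER BY spcname;

which is basically what I would expect \db+ to show.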

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


[HACKERS] BUGFIX: Dynamic bgworkers with BGW_NEVER_RESTART worker restarted on FatalError

2014-05-15 Thread Craig Ringer
Hi all

There's a bug in the dynamic bgworkers code that I think needs fixing
before release. TL;DR: BGW_NO_RESTART workers are restarted after a
postmaster crash; the attached patch changes that.

The case that's triggering the issue is where a static bgworker is
registering a new dynamic bgworker to do some setup work each time it
starts. The static worker is relaunched on postmaster restart and
registers a new BGW_NO_RESTART dynamic bgworker. This dynamic bgworker
immediately hits an Assert and dies.

This *should* cause a postmaster restart loop; that's expected. What it
shouldn't do, but is doing, is end up with multiple copies of the bgworker:
it fails to purge the old BGW_NO_RESTART one and relaunches it as if it'd
never crashed.

The attached patch fixes the problem.



Detail:

If you set a worker as BGW_NO_RESTART it isn't restarted if it ERRORs
out. That's fine.

With Petr's applied patch it no longer restarts on exit 0 (normal exit)
either.

There's a third case, though: a bgworker crash causing postmaster
restart. In this case the bgworker is still restarted, which makes no
sense at all if it isn't for the other two cases.

The existing code looks like it tries to protect against this - in
maybe_start_bgworker, the invocation of do_start_bgworker is protected
by a prior test for rw->rw_crashed_at that, if
rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART, unregisters the
worker and skips to the next one.

However, in my testing a breakpoint inside the if (rw->rw_crashed_at !=
0) test in maybe_start_bgworker is never hit, even for a bgworker that
is known to have crashed. rw->rw_crashed_at is always zero.

The culprit is ResetBackgroundWorkerCrashTimes, which unconditionally
resets the crash time without considering that the worker might be
BGW_NO_RESTART.

The attached patch makes ResetBackgroundWorkerCrashTimes only reset the
crashed time for workers with a restart time set.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From a48de97f593b17160487e67f953c8d2b25f9e98a Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Fri, 16 May 2014 13:29:16 +0800
Subject: [PATCH] Don't restart BGW_NO_RESTART workers after postmaster crash

ResetBackgroundWorkerCrashTimes was ignoring BGW_NO_RESTART and clearing the
crash time of all bgworkers on postmaster restart so they'd be restarted as
soon as the postmaster was up and ready. It should only do this for workers
that are supposed to be restarted, as a BGW_NO_RESTART worker with no
rw_crashed_at set is assumed to be newly registered.

As a result, if a bgworker registered another BGW_NO_RESTART bgworker
during postmaster restart we'd get an increasing number of duplicates
of the BGW_NO_RESTART worker every time the postmaster restarted.
---
 src/backend/postmaster/bgworker.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 85a3b3a..4bbebb6 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -411,7 +411,8 @@ ResetBackgroundWorkerCrashTimes(void)
 		RegisteredBgWorker *rw;
 
 		rw = slist_container(RegisteredBgWorker, rw_lnode, iter.cur);
-		rw->rw_crashed_at = 0;
+		if (rw->rw_worker.bgw_restart_time != BGW_NEVER_RESTART)
+			rw->rw_crashed_at = 0;
 	}
 }
 
-- 
1.9.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Trigger concurrent execution

2014-05-15 Thread Craig Ringer
On 05/16/2014 08:06 AM, Blagoj Petrushev wrote:
 Hi,
 
 I'm thinking of an extension to trigger functionality like this:
 
 CREATE TRIGGER trigger_name
 AFTER event
 ON table
 CONCURRENTLY EXECUTE PROCEDURE trigger_fc
 
 This would call the trigger after the end of the transaction.

If "after the end of the transaction" is what you mean by
"concurrently", then that's the wrong word to choose.

AFTER COMMIT ?

The concept of running a trigger concurrently just doesn't make sense
in PostgreSQL, because the backend is single threaded. You wouldn't be
able to run any SQL commands until the trigger finished.

It isn't possible to do anything useful without a transaction, so
PostgreSQL would need to start a transaction for the trigger and commit
the transaction at the end, as if you'd run SELECT my_procedure();.
Because it's outside the scope of the transaction it probably wouldn't
be possible to do FOR EACH ROW with a NEW and OLD var, unless you
stashed them as materialized rows in the queue of pending AFTER COMMIT
triggers.

Finally, because it's after transaction commit, you couldn't easily
guarantee that the trigger would really run. If the backend crashed /
the server was shut down  / etc after the commit but before your trigger
finished, you'd have a committed transaction but the trigger would not
run. To fix that you'd need to somehow make the trigger queue WAL-logged
and run it during replay, which from my rather limited understanding of
this area would be ... interesting to do. It'd also mean the trigger
couldn't have any session context.

This isn't easy, if it's practical at all.

 I have a big table with big text column article and a nullable
 tsvector column fts_article. On each insert or update that changes the
 article, I trigger-issue 'NOTIFY article_changed row_id', then, with a
 daemon listener, I catch the notification and update fts_article
 accordingly with my_fts_fc(article). The reason I don't do this
 directly in my trigger is because my_fts_fc is slow for big articles,
 fts_article has a gin index, and also, on heavy load, my listener can
 do these updates concurrently. Now, with a concurrent execution of
 triggers, I can just call my_fts_fc inside the trigger instead of the
 notify roundtrip.

I don't think that really fits.

It seems like you want to run the trigger procedure in the background on
another back-end. That'd be quite cool, but also not trivial to do,
especially if you wanted to guarantee that it happened reliably and in a
crash-safe manner.



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers