Re: [HACKERS] Fixing pg_basebackup with tablespaces found in $PGDATA

2014-01-05 Thread Craig Ringer
On 01/02/2014 06:53 AM, Dimitri Fontaine wrote:
 As much as I've seen people frown upon $subject, it still happens in the
 wild

I met a new case of it a couple of weeks ago, so I can certainly confirm
that.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for pg_stat_archiver view

2014-01-05 Thread Magnus Hagander
On Sat, Jan 4, 2014 at 2:01 PM, Gabriele Bartolini 
gabriele.bartol...@2ndquadrant.it wrote:

 On 04/01/14 13:25, Magnus Hagander wrote:
  With those two, I think it would make much sense to have a view like
  this.

 Ok, I will prepare version 2 with those.



  Oh, and you need to change the format id number of the stats file.

 I have not found any instruction on how to set it. I assume you are
 talking about this:

 PGSTAT_FILE_FORMAT_ID 0x01A5BC9B

 Any suggestion is welcome.


Yes, that's what I'm talking about. And just increment it by 1.

Not sure where the original value came from, but that's what people have
been doing recently.
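In other words (a throwaway illustration of the convention, not part of the patch):

```python
# Current stats file format id quoted above, bumped by one as Magnus suggests.
PGSTAT_FILE_FORMAT_ID_OLD = 0x01A5BC9B
PGSTAT_FILE_FORMAT_ID_NEW = PGSTAT_FILE_FORMAT_ID_OLD + 1

print(hex(PGSTAT_FILE_FORMAT_ID_NEW))  # 0x1a5bc9c
```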


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Craig Ringer
On 01/04/2014 01:06 AM, Claudio Freire wrote:
 You're forgetting ORM workloads.

I'm impressed that you've come up with an area where ORMs are beneficial ;-)

JDBC also has a statement batching interface. Right now PgJDBC just
unwraps the batch and runs each query individually. Any async-support
improvements server-side should probably consider the need of executing
a batch. The batch might be one PreparedStatement with many different
parameters, or it might be a series of unrelated statements. A way for
PgJDBC to run the batch without syncing with the server after each query
would be really helpful.

So would a way to BIND an array of parameters, so we could execute a
prepared statement once with multiple parameter sets and then sync up
with the server after all executions.
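For what it's worth, the client-side shape being asked for is what DB-API drivers already expose as executemany; a sketch using Python's stdlib sqlite3 as a stand-in for a PostgreSQL driver (table name invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(x INTEGER)")

# One statement, many parameter sets -- the client-side analogue of the
# proposed "BIND an array of parameters, then sync once" protocol flow.
con.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

count = con.execute("SELECT count(*) FROM t").fetchone()[0]
print(count)  # 3
```

The protocol-level feature under discussion would let a driver ship all the parameter sets before waiting for any result, instead of merely looping client-side as sqlite3 does here.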

As for ORMs benefiting from this: Remember that Npgsql and PgJDBC don't
use libpq. So the libpq changes would only help ORMs based on things
like Python (psycopg2), Ruby (Pg gem), etc, where they're using libpq
wrapper drivers.

 Execute-many of prepared statements is another one, quite common.

That's the case I'd really love to see proper server-side batch support
for. BIND_MULTIPLE, EXECUTE.

 I'm not sure what would happen if one of the queries returned an
 error. If in a transaction, all the following queries would error out
 I'd imagine. If not, they would simply be executed blindly.. am I
 correct?

It's not just dealing with erroring out. Many ORMs look at the count of
rows affected to detect whether an operation conflicted with another
concurrent operation when optimistic concurrency control is in use. E.g.

UPDATE t SET x = 'fred' WHERE rowversion = 4;

will be seen to fail if it reports that it has affected zero rows. This
is one of the reasons ORM users have such serious problems with the
write side of our partitioning support - we discard affected row counts,
and many ORMs don't deal well with that.

At least in JDBC, executeBatch returns an array of rowcounts, so you
can't throw away affected row counts when running batches; they must be
returned to the client. It doesn't matter whether that's a single
protocol message with a list of IDs or a series of individual messages.
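The rowcount-based conflict check described above, sketched with stdlib sqlite3 standing in for PostgreSQL (table and column names invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(x TEXT, rowversion INTEGER)")
con.execute("INSERT INTO t VALUES ('old', 5)")

# Another transaction already moved rowversion from 4 to 5, so this
# UPDATE matches nothing; the ORM detects the conflict via rowcount.
cur = con.execute("UPDATE t SET x = 'fred' WHERE rowversion = 4")
affected = cur.rowcount
print(affected)  # 0
```

This is why discarding affected-row counts (as the partitioning write path does) breaks optimistic locking in ORMs.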

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Craig Ringer
On 01/04/2014 01:22 AM, Merlin Moncure wrote:
 Long term, I'd rather see an optimized 'ORM flush' assemble the data
 into a structured data set (perhaps a JSON document) and pass it to
 some receiving routine that decomposed it into records.

The same is true on the input side. I'd much rather be sending an ORM
client a big JSON / YAML / whatever graph than a horrible,
duplication-filled chained LEFT JOIN projection like they currently rely
on. When they're not just doing n+1 selects, which is worse.
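The duplication being complained about, and the nested document an ORM would rather receive, in a small sqlite3 sketch (schema invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE author(id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE book(id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
con.execute("INSERT INTO author VALUES (1, 'Ann')")
con.executemany("INSERT INTO book VALUES (?, ?, ?)",
                [(1, 1, 'First'), (2, 1, 'Second')])

# The chained-join projection repeats the author columns once per book...
rows = con.execute("""
    SELECT author.name, book.title
    FROM author LEFT JOIN book ON book.author_id = author.id
    ORDER BY book.id
""").fetchall()
print(rows)  # [('Ann', 'First'), ('Ann', 'Second')]

# ...whereas a graph-shaped result would carry them exactly once.
graph = {"name": rows[0][0], "books": [title for _, title in rows]}
```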

I think that's really a side-issue though. ORMs aren't going to change
in a hurry, and batching / fire-and-forget support is good for all sorts
of other jobs too.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Greg Stark
On Fri, Jan 3, 2014 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think Florian has a good point there, and the reason is this: what
 you are talking about will be of exactly zero use to applications that
 want to see the results of one query before launching the next.

There are actually techniques for handling that. For a better
explanation than I can give, see
http://kentonv.github.io/capnproto/rpc.html. It's mostly a language
feature, but it does require support from the protocol to be able to
reference data from earlier responses in subsequent requests.

You can pull this trick by having an RPC server near the database, so
that all the latency is absorbed by the RPC server rather than by the
database connection. But that seems to me an unnecessary complication;
the database protocol basically is an RPC layer already.
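The Cap'n Proto trick (promise pipelining) amounts to letting a later request name the result of an earlier one, so both can be shipped in a single round trip. A toy illustration of the idea (all names invented; nothing like the real capnproto API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    """Placeholder naming the result of an earlier request in the batch."""
    index: int

def run_pipeline(requests):
    """Execute a batch where arguments may reference earlier results."""
    ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
    results = []
    for op, args in requests:
        # Resolve references to earlier responses before executing.
        resolved = [results[a.index] if isinstance(a, Ref) else a for a in args]
        results.append(ops[op](*resolved))
    return results

# Both requests are sent at once; the second consumes the first's result
# without the client ever seeing it.
print(run_pipeline([("add", (1, 2)), ("mul", (Ref(0), 10))]))  # [3, 30]
```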


-- 
greg




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Florian Weimer

On 01/05/2014 03:11 PM, Greg Stark wrote:

On Fri, Jan 3, 2014 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:

I think Florian has a good point there, and the reason is this: what
you are talking about will be of exactly zero use to applications that
want to see the results of one query before launching the next.


There are actually techniques for handling that. For a better
explanation than I can give, see
http://kentonv.github.io/capnproto/rpc.html. It's mostly a language
feature, but it does require support from the protocol to be able to
reference data from earlier responses in subsequent requests.


Purely from a perspective of latency avoidance, sub-queries, WITH or 
stored procedures can achieve the same thing, and work even if the 
intermediate result has to undergo some transformation. :-)
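Florian's point in miniature: two dependent lookups that would otherwise cost two round trips, folded into one statement with a CTE (sqlite3 standing in for PostgreSQL, schema invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, parent INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, None), (2, 1), (3, 2)])

# A naive client would SELECT the parent id, wait, then SELECT the parent
# row. With a CTE the intermediate result never travels to the client.
row = con.execute("""
    WITH p AS (SELECT parent FROM t WHERE id = 3)
    SELECT t.id FROM t JOIN p ON t.id = p.parent
""").fetchone()
print(row)  # (2,)
```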


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread Florian Weimer

On 01/04/2014 04:39 PM, Martijn van Oosterhout wrote:


Why switch between COPY commands, why could you not do it in one? For
example:

COPY table1(col1, col2, ...),
  table2(col1, col2, ...)
FROM STDIN WITH (tableoids);
tableoid1<tab>col1<tab>col2...
tableoid2<tab>...
...
\.


My original idea was to avoid any server-side changes, so that 
applications do not need fallback code for talking to old servers.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread Robert Haas
On Sat, Jan 4, 2014 at 3:27 PM, knizhnik knizh...@garret.ru wrote:
 1. I want IMCS to work with PostgreSQL versions not supporting DSM (dynamic
 shared memory), like 9.2, 9.3.1,...

Yeah.  If it's loaded at postmaster start time, then it can work with
any version.  On 9.4+, you could possibly make it work even if it's
loaded on the fly by using the dynamic shared memory facilities.
However, there are currently some limitations to those facilities that
make some things you might want to do tricky.  There are pending
patches to lift some of these limitations.

 2. IMCS is using PostgreSQL hash table implementation (ShmemInitHash,
 hash_search,...)
 May be I missed something - I just noticed DSM and have no chance to
 investigate it, but looks like hash table can not be allocated in DSM...

It wouldn't be very difficult to write an analog of ShmemInitHash() on
top of the dsm_toc patch that is currently pending.  A problem,
though, is that it's not currently possible to put LWLocks in dynamic
shared memory, and even spinlocks will be problematic if
--disable-spinlocks is used.  I'm due to write a post about these
problems; perhaps I should go do that.

 3. IMCS is allocating memory using ShmemAlloc. In case of using DSM I have
 to provide own allocator (although creation of non-releasing memory
 allocator should not be a big issue).

The dsm_toc infrastructure would solve this problem.

 4. Current implementation of DSM still suffers from 256Gb problem. Certainly
 I can create multiple segments and so provide workaround without using huge
 pages, but it complicates allocator.

So it sounds like DSM should also support huge pages somehow.  I'm not
sure what that should look like.

 5. I wonder if I dynamically add new DSM segment - will it be available for
 other PostgreSQL processes? For example I run query which loads data in IMCS
 and so needs more space and allocates new DSM segment. Then another query is
 executed by other PostgreSQL process which tries to access this data. This
 process is not forked from the process created this new DSM segment, so I do
 not understand how this segment will be mapped to the address space of this
 process, preserving address... Certainly I can prohibit dynamic extension of
 IMCS storage (hoping that in this case there will be no such problem with
 DSM). But in this case we will lose the main advantage of using DSM instead
 of the old scheme of plugin-private shared memory.

You can definitely dynamically add a new DSM segment; that's the point
of making it *dynamic* shared memory.  What's a bit tricky as things
stand today is making sure that it sticks around.  The current model
is that the DSM segment is destroyed when the last process unmaps it.
It would be easy enough to lift that limitation on systems other than
Windows; we could just add a dsm_keep_until_shutdown() API or
something similar.  But on Windows, segments are *automatically*
destroyed *by the operating system* when the last process unmaps them,
so it's not quite so clear to me how we can allow it there.  The main
shared memory segment is no problem because the postmaster always has
it mapped, even if no one else does, but that doesn't help for dynamic
shared memory segments.
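The "definitely can attach dynamically" point has a close analogue in Python's stdlib multiprocessing.shared_memory: a segment created in one process can be attached purely by name from another process. (A sketch of the concept only, POSIX-only as written; PostgreSQL's dsm.c works differently in detail.)

```python
import multiprocessing as mp
from multiprocessing import shared_memory

def writer(name):
    # Attach to an existing segment purely by name -- no inherited mapping.
    seg = shared_memory.SharedMemory(name=name)
    seg.buf[0] = 42
    seg.close()

ctx = mp.get_context("fork")  # POSIX-only sketch
seg = shared_memory.SharedMemory(create=True, size=16)
p = ctx.Process(target=writer, args=(seg.name,))
p.start()
p.join()
value = seg.buf[0]
seg.close()
seg.unlink()  # destruction is explicit here; last detach alone does not destroy it
print(value)  # 42
```

Note the contrast with the lifetime question in the thread: this API makes destruction an explicit unlink, whereas the dsm.c model discussed above destroys the segment at last detach.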

 6. IMCS has some configuration parameters which has to be set through
 postgresql.conf. So in any case user has to edit postgresql.conf file.
 In case of using DSM it will be not necessary to add IMCS to
 shared_preload_libraries list. But I do not think that it is so restrictive
 and critical requirement, is it?

I don't really see a problem here.  One of the purposes of dynamic
shared memory (and dynamic background workers) is precisely that you
don't *necessarily* need to put extensions that use shared memory in
shared_preload_libraries - or in other words, you can add the
extension to a running server without restarting it.  If you know in
advance that you will want it, you probably still *want* to put it in
shared_preload_libraries, but part of the idea is that we can get away
from requiring that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread james

On 05/01/2014 16:50, Robert Haas wrote:

  But on Windows, segments are *automatically*
destroyed *by the operating system* when the last process unmaps them,
so it's not quite so clear to me how we can allow it there.  The main
shared memory segment is no problem because the postmaster always has
it mapped, even if no one else does, but that doesn't help for dynamic
shared memory segments.

Surely you just need to DuplicateHandle into the parent process?  If you
want to (tidily) dispose of it at some time, then you'll need to tell the
postmaster that you have done so and what the handle is in its process,
but if you just want it to stick around, then you can just pass it up.



[HACKERS] dynamic shared memory and locks

2014-01-05 Thread Robert Haas
One of the things that you might want to do with dynamic shared memory
is store a lock in it.  In fact, my bet is that almost everything that
uses dynamic shared memory will want to do precisely that, because, of
course, it's dynamic *shared* memory, which means that it is
concurrently accessed by multiple processes, which tends to require
locking.  Typically, what you're going to want are either spinlocks
(for very short critical sections) or lwlocks (for longer ones).  It
doesn't really make sense to talk about storing heavyweight locks in
dynamic shared memory, because we're talking about storing locks with
the data structures that they protect, and heavyweight locks are used
to protect database or shared objects, not shared memory structures.
Of course, someone might think of trying to provide a mechanism for
the heavyweight lock manager to overflow to dynamic shared memory, but
that's a different thing altogether and not what I'm talking about
here.

Right now, storing spinlocks in dynamic shared memory *almost* works,
but there are problems with --disable-spinlocks.  In that
configuration, we use semaphores to simulate spinlocks.  Every time
someone calls SpinLockInit(), it's going to allocate a new semaphore
which will never be returned to the operating system, so you're pretty
quickly going to run out.  There are a couple of things we could do
about this:

1. Decide we don't care.  If you compile with --disable-spinlocks, and
then you try to use dynamic shared memory, it's going to leak
semaphores until none remain, and then start failing from there until
the postmaster is restarted.  If you don't like that, provide a
working spinlock implementation for your platform.

2. Forbid the use of dynamic shared memory when compiling with
--disable-spinlocks.  This is a more polite version of #1.  It seems
likely to me that nearly every piece of code that uses dynamic shared
memory will require locking.  Instead of letting people allocate
dynamic shared memory segments anyway and then having them start
failing shortly after postmaster startup, we could just head the
problem off at the pass by denying the request for dynamic shared
memory in the first place.  Dynamic shared memory allocation can
always fail (e.g. because we're out of memory) and also has an
explicit off switch that will make all requests fail
(dynamic_shared_memory_type=none), so any code that uses dynamic
shared memory has to be prepared for a failure at that point, whereas
a failure in SpinLockInit() might be more surprising.

3. Provide an inverse for SpinLockInit, say SpinLockDestroy, and
require all code written for dynamic shared memory to invoke this
function on every spinlock before the shared memory segment is
destroyed.  I initially thought that this could be done using the
on_dsm_detach infrastructure, but it turns out that doesn't really
work.  The on_dsm_detach infrastructure is designed to make sure that
you *release* all of your locks when detaching - i.e. those hooks get
invoked for each process that detaches.  For this, you'd need an
on_dsm_final_detach callback that gets called only for the very last
detach (and after prohibiting any other processes from attaching).  I
can certainly engineer all that, but it's a decent amount of extra
work for everyone who wants to use dynamic shared memory to write the
appropriate callback, and because few people actually use
--disable-spinlocks, I think those callbacks will tend to be rather
lightly tested and thus a breeding ground for marginal bugs that
nobody's terribly excited about fixing.

4. Drop support for --disable-spinlocks.

For what it's worth, my vote is currently for #2.  I can't think of
many interesting things to do with dynamic shared memory without having at
least spinlocks, so I don't think we'd be losing much.  #1 seems
needlessly unfriendly, #3 seems like a lot of work for not much, and
#4 seems excessive at least as a solution to this particular problem,
though there may be other arguments for it.  What do others think?

I think we're also going to want to be able to create LWLocks in
dynamic shared memory: some critical sections won't be short enough or
safe enough to be protected by spinlocks.  At some level this seems
easy: change LWLockAcquire and friends to accept an LWLock * rather
than an LWLockId, and similarly change held_lwlocks[] to hold LWLock
pointers rather than LWLockIds.  One tricky point is that you'd better
try not to detach a shared memory segment while you're holding lwlocks
inside that segment, but I think just making that a coding rule
shouldn't cause any great problem, and conversely you'd better release
all lwlocks in the segment before detaching it, but this seems mostly
OK: throwing an error will call LWLockReleaseAll before doing the
resource manager cleanups that will unmap the dynamic shared memory
segment, so that's probably OK too.  There may be corner cases I
haven't thought about, though.  A bigger problem is that I think we
want to avoid 

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread Robert Haas
On Sun, Jan 5, 2014 at 12:34 PM, james ja...@mansionfamily.plus.com wrote:
 On 05/01/2014 16:50, Robert Haas wrote:

  But on Windows, segments are *automatically*
 destroyed *by the operating system* when the last process unmaps them,
 so it's not quite so clear to me how we can allow it there.  The main
 shared memory segment is no problem because the postmaster always has
 it mapped, even if no one else does, but that doesn't help for dynamic
 shared memory segments.

 Surely you just need to DuplicateHandle into the parent process?  If you
 want to (tidily) dispose of it at some time, then you'll need to tell the
 postmaster that you have done so and what the handle is in its process,
 but if you just want it to stick around, then you can just pass it up.

Uh, I don't know, maybe?  Does the postmaster have to do something to
receive the duplicated handle, or can the child just throw it over the
wall to the parent and let it rot until the postmaster finally exits?
The latter would be nicer for our purposes, perhaps, as running more
code from within the postmaster is risky for us.  If a regular backend
process dies, the postmaster will restart everything and the database
will come back on line, but if the postmaster itself dies, we're hard
down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] dynamic shared memory and locks

2014-01-05 Thread Andres Freund
On 2014-01-05 12:56:05 -0500, Robert Haas wrote:
 Right now, storing spinlocks in dynamic shared memory *almost* works,
 but there are problems with --disable-spinlocks.  In that
 configuration, we use semaphores to simulate spinlocks.  Every time
 someone calls SpinLockInit(), it's going to allocate a new semaphore
 which will never be returned to the operating system, so you're pretty
 quickly going to run out.  There are a couple of things we could do
 about this:

 4. Drop support for --disable-spinlocks.

I very strongly vote for 4). I think we're going to hit this more and
more often, and it's a facility that benefits almost nobody. Just about
every new platform is or will be on gcc or clang, and you can just
duplicate the compiler-provided generic implementation we have for ARM.

The atomics implementation makes this an automatic fallback if there's
no compiler-specific variant around.

 I think we're also going to want to be able to create LWLocks in
 dynamic shared memory: some critical sections won't be short enough or
 safe enough to be protected by spinlocks.

Agreed.

 At some level this seems  easy: change LWLockAcquire and friends to
 accept an LWLock * rather than an LWLockId, and similarly change
 held_lwlocks[] to hold LWLock pointers rather than LWLockIds.

My primary reason isn't dsm TBH but wanting to embed the buffer lwlocks
in the bufferdesc, on the same cacheline as the buffer header's
spinlock. All the embedded ones can be allocated without padding, while
the relatively low number of non-embedded ones can be padded to the full
cacheline size.

 A bigger problem is that I think we
 want to avoid having a large amount of notational churn.  The obvious
 way to do that is to get rid of the LWLockId array and instead declare
 each fixed LWLock separately as e.g. LWLock *ProcArrayLock.  However,
 creating a large number of new globals that will need to be
 initialized in every new EXEC_BACKEND process seems irritating.  So
 maybe a better idea is to do something like this:

 #define BufFreelistLock (fixedlwlocks[0])
 #define ShmemIndexLock (fixedlwlocks[1])
 ...
 #define AutoFileLock (fixedlwlocks[36])
 #define NUM_FIXED_LWLOCKS 37
 
 Comments, suggestions?

My idea here was to just have two APIs, a legacy one that works like the
current one, and a new one that locks the lwlocks passed in by a
pointer. After all, LWLockAssign() which you use in extensions currently
returns a LWLockId. Seems ugly to turn that into a pointer.

But perhaps your idea is better anyway, no matter the hackery of turning
LWLockId into a pointer.

Greetings,

Andres Freund




Re: [HACKERS] [PATCH] Support for pg_stat_archiver view

2014-01-05 Thread Gabriele Bartolini
On 05/01/14 13:52, Magnus Hagander wrote:
 Yes, that's what I'm talking about. And just increment it by 1.
Done. I am attaching version 2 of the patch, which now implements only
one function (pg_stat_get_archiver()) and adds:

* failed attempts
* WAL of the last failed attempt
* time of the last failed attempt

Thanks for your input.

Ciao,
Gabriele

-- 
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4ec6981..0094c19 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -270,6 +270,14 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
  </row>
 
  <row>
+  <entry><structname>pg_stat_archiver</><indexterm><primary>pg_stat_archiver</primary></indexterm></entry>
+  <entry>One row only, showing statistics about the
+   WAL archiver process's activity. See
+   <xref linkend="pg-stat-archiver-view"> for details.
+  </entry>
+ </row>
+
+ <row>
   <entry><structname>pg_stat_bgwriter</><indexterm><primary>pg_stat_bgwriter</primary></indexterm></entry>
   <entry>One row only, showing statistics about the
    background writer process's activity. See
@@ -648,6 +656,64 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
   </para>
  </note>
 
+  <table id="pg-stat-archiver-view" xreflabel="pg_stat_archiver">
+   <title><structname>pg_stat_archiver</structname> View</title>
+
+   <tgroup cols="3">
+    <thead>
+     <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry><structfield>archived_wals</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL files that have been successfully archived</entry>
+     </row>
+     <row>
+      <entry><structfield>last_archived_wal</></entry>
+      <entry><type>text</type></entry>
+      <entry>Name of the last successfully archived WAL file</entry>
+     </row>
+     <row>
+      <entry><structfield>last_archived_wal_time</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time of the last successful archival operation</entry>
+     </row>
+     <row>
+      <entry><structfield>failed_attempts</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of failed attempts for archiving WAL files</entry>
+     </row>
+     <row>
+      <entry><structfield>last_failed_wal</></entry>
+      <entry><type>text</type></entry>
+      <entry>Name of the WAL file of the last failed archival operation</entry>
+     </row>
+     <row>
+      <entry><structfield>last_failed_wal_time</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time of the last failed archival operation</entry>
+     </row>
+     <row>
+      <entry><structfield>stats_reset</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time at which these statistics were last reset</entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <para>
+   The <structname>pg_stat_archiver</structname> view will always have a
+   single row, containing data about the archiver process of the cluster.
+  </para>
+
+
   <table id="pg-stat-bgwriter-view" xreflabel="pg_stat_bgwriter">
    <title><structname>pg_stat_bgwriter</structname> View</title>
 
@@ -1613,6 +1679,8 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
    argument (requires superuser privileges).
    Calling <literal>pg_stat_reset_shared('bgwriter')</> will zero all the
    counters shown in the <structname>pg_stat_bgwriter</> view.
+   Calling <literal>pg_stat_reset_shared('archiver')</> will zero all the
+   counters shown in the <structname>pg_stat_archiver</> view.
   </entry>
  </row>
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 575a40f..5ea8c87 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -672,6 +672,17 @@ CREATE VIEW pg_stat_xact_user_functions AS
     WHERE P.prolang != 12  -- fast check to eliminate built-in functions
       AND pg_stat_get_xact_function_calls(P.oid) IS NOT NULL;
 
+CREATE VIEW pg_stat_archiver AS
+    SELECT
+        s.archived_wals,
+        s.last_archived_wal,
+        s.last_archived_wal_time,
+        s.failed_attempts,
+        s.last_failed_wal,
+        s.last_failed_wal_time,
+        s.stats_reset
+    FROM pg_stat_get_archiver() s;
+
 CREATE VIEW pg_stat_bgwriter AS
     SELECT
         pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 2bb572e..60f957c 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -36,6 +36,7 @@
 #include "access/xlog_internal.h"
 #include "libpq/pqsignal.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "postmaster/fork_process.h"
 #include "postmaster/pgarch.h"
 #include "postmaster/postmaster.h"
@@ -46,6 +47,7 @@
 #include "storage/pmsignal.h"
 #include "utils/guc.h"
 #include "utils/ps_status.h"
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread knizhnik
From my point of view it is not a big problem that it is not possible 
to place an LWLock in DSM.
I can allocate LWLocks in the standard way - using RequestAddinLWLocks - 
and use them for synchronization.


Concerning support for huge pages - I do not think that it should 
involve anything more than setting the MAP_HUGETLB flag.
Allocation of the corresponding number of huge pages should be done by 
the system administrator.


What I still do not completely understand is how DSM ensures that a 
segment created by one PostgreSQL process will be mapped to the same 
virtual memory address in all other PostgreSQL processes.
As far as I understand, right now (with standard PostgreSQL shared 
memory segments) this is ensured by fork(): shared memory segments are 
allocated in one process, and all other processes are forked from it, 
inheriting those segments.


But if a new DSM segment is allocated during execution of some query, 
then we should add it to the virtual address space of all PostgreSQL 
processes. Even if we somehow notify them all about the presence of the 
new segment, there is absolutely no guarantee that all of them can map 
this segment at the specified memory address (it may already be in use 
for some other shared object).
Or maybe DSM doesn't guarantee that a DSM segment is mapped to the same 
address in all processes?
In that case DSM usage becomes significantly more complicated: it will 
not be possible to use direct pointers.
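When the same segment can indeed land at different addresses in different processes, the standard workaround is to store offsets relative to the segment base instead of raw pointers. A toy sketch of the idea (not PostgreSQL code):

```python
import struct

# A pretend shared segment: records are (value, next_offset) pairs,
# chained by byte offsets so the layout is position-independent.
segment = bytearray(64)
REC = struct.Struct("ii")
END = -1

def write_rec(off, value, next_off):
    REC.pack_into(segment, off, value, next_off)

def walk(start):
    off, out = start, []
    while off != END:
        value, off = REC.unpack_from(segment, off)
        out.append(value)
    return out

write_rec(0, 10, REC.size)    # record at offset 0 points at the next one
write_rec(REC.size, 20, END)
print(walk(0))  # [10, 20]
```

Whatever address each process maps the segment at, the offset-based chain reads the same.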


Can you please clarify how dynamically allocated DSM segments will be 
shared by all PostgreSQL processes?



On 01/05/2014 08:50 PM, Robert Haas wrote:

On Sat, Jan 4, 2014 at 3:27 PM, knizhnik knizh...@garret.ru wrote:

1. I want IMCS to work with PostgreSQL versions not supporting DSM (dynamic
shared memory), like 9.2, 9.3.1,...

Yeah.  If it's loaded at postmaster start time, then it can work with
any version.  On 9.4+, you could possibly make it work even if it's
loaded on the fly by using the dynamic shared memory facilities.
However, there are currently some limitations to those facilities that
make some things you might want to do tricky.  There are pending
patches to lift some of these limitations.


2. IMCS is using PostgreSQL hash table implementation (ShmemInitHash,
hash_search,...)
May be I missed something - I just noticed DSM and have no chance to
investigate it, but looks like hash table can not be allocated in DSM...

It wouldn't be very difficult to write an analog of ShmemInitHash() on
top of the dsm_toc patch that is currently pending.  A problem,
though, is that it's not currently possible to put LWLocks in dynamic
shared memory, and even spinlocks will be problematic if
--disable-spinlocks is used.  I'm due to write a post about these
problems; perhaps I should go do that.


3. IMCS is allocating memory using ShmemAlloc. In case of using DSM I have
to provide own allocator (although creation of non-releasing memory
allocator should not be a big issue).

The dsm_toc infrastructure would solve this problem.


4. Current implementation of DSM still suffers from 256Gb problem. Certainly
I can create multiple segments and so provide workaround without using huge
pages, but it complicates allocator.

So it sounds like DSM should also support huge pages somehow.  I'm not
sure what that should look like.


5. I wonder if I dynamically add new DSM segment - will it be available for
other PostgreSQL processes? For example I run query which loads data in IMCS
and so needs more space and allocates new DSM segment. Then another query is
executed by other PostgreSQL process which tries to access this data. This
process is not forked from the process created this new DSM segment, so I do
not understand how this segment will be mapped to the address space of this
process, preserving address... Certainly I can prohibit dynamic extension of
IMCS storage (hoping that in this case there will be no such problem with
DSM). But in this case we will lose the main advantage of using DSM instead
of the old scheme of plugin-private shared memory.

You can definitely dynamically add a new DSM segment; that's the point
of making it *dynamic* shared memory.  What's a bit tricky as things
stand today is making sure that it sticks around.  The current model
is that the DSM segment is destroyed when the last process unmaps it.
It would be easy enough to lift that limitation on systems other than
Windows; we could just add a dsm_keep_until_shutdown() API or
something similar.  But on Windows, segments are *automatically*
destroyed *by the operating system* when the last process unmaps them,
so it's not quite so clear to me how we can allow it there.  The main
shared memory segment is no problem because the postmaster always has
it mapped, even if no one else does, but that doesn't help for dynamic
shared memory segments.


6. IMCS has some configuration parameters which have to be set through
postgresql.conf. So in any case the user has to edit the postgresql.conf file.
In case of using DSM 

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread james

On 05/01/2014 18:02, Robert Haas wrote:

On Sun, Jan 5, 2014 at 12:34 PM, jamesja...@mansionfamily.plus.com  wrote:

On 05/01/2014 16:50, Robert Haas wrote:

  But on Windows, segments are *automatically*
destroyed *by the operating system* when the last process unmaps them,
so it's not quite so clear to me how we can allow it there.  The main
shared memory segment is no problem because the postmaster always has
it mapped, even if no one else does, but that doesn't help for dynamic
shared memory segments.

Surely you just need to DuplicateHandle into the parent process?  If you
want to (tidily) dispose of it at some time, then you'll need to tell the
postmaster that you have done so and what the handle is in its process,
but if you just want it to stick around, then you can just pass it up.

Uh, I don't know, maybe?  Does the postmaster have to do something to
receive the duplicated handle


In principle, no, so long as the child has a handle to the parent 
process that has

the appropriate permissions.  Given that these processes have a parent/child
relationship that shouldn't be too hard to arrange.

, or can the child just throw it over the
wall to the parent and let it rot until the postmaster finally exits?

Yes.  Though it might be a good idea to record the handle somewhere (perhaps
in a table) so that any potential issues from an insane system spamming 
the postmaster

with handles are apparent.

I'm intrigued - how are the handles shared between children that are peers
in the current scheme?  Some handle transfer must already be in place.

Could you share the handles to an immortal worker if you want to reduce any
potential impact on the postmaster?

The latter would be nicer for our purposes, perhaps, as running more
code from within the postmaster is risky for us.  If a regular backend
process dies, the postmaster will restart everything and the database
will come back on line, but if the postmaster itself dies, we're hard
down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] dynamic shared memory and locks

2014-01-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 For what it's worth, my vote is currently for #2.  I can't think of
 many interesting things to do with dynamic shared memory without having at
 least spinlocks, so I don't think we'd be losing much.  #1 seems
 needlessly unfriendly, #3 seems like a lot of work for not much, and
 #4 seems excessive at least as a solution to this particular problem,
 though there may be other arguments for it.  What do others think?

I agree with this position.  There may be some good reason to drop
--disable-spinlocks altogether in future, but DSM isn't a sufficient
excuse.

 I think we're also going to want to be able to create LWLocks in
 dynamic shared memory: some critical sections won't be short enough or
 safe enough to be protected by spinlocks.  At some level this seems
 easy: change LWLockAcquire and friends to accept an LWLock * rather
 than an LWLockId, and similarly change held_lwlocks[] to hold LWLock
 pointers rather than LWLockIds.

I seem to recall that there was some good reason for keeping all the
LWLocks in an array, back when the facility was first designed.
I'm too lazy to research the point right now, but you might want to
go back and look at the archives around when lwlock.c was written.

 creating a large number of new globals that will need to be
 initialized in every new EXEC_BACKEND process seems irritating.

This might've been the good reason, but not sure --- I think LWLocks
predate our Windows support.

In the end, though, that decision was taken before we were concerned
about being able to add new LWLocks on the fly, which is what this is
really about (whether they're stored in DSM or not is a secondary point).
The pressure for that has gotten strong enough that it may be time to
change the tradeoff.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for pg_stat_archiver view

2014-01-05 Thread Fabrizio Mello
Sent via iPhone

 On 05/01/2014, at 16:27, Gabriele Bartolini 
 gabriele.bartol...@2ndquadrant.it wrote:
 
 On 05/01/14 13:52, Magnus Hagander wrote:
 Yes, that's what I'm talking about. And just increment it by 1.
 Done. I am attaching version 2 of the patch, which now implements only
 one function (pg_stat_get_archiver()) and adds:
 
 * failed attempts
 * WAL of the last failed attempt
 * time of the last failed attempt

Hi, 

I haven't seen your code yet, but I would like to know if it is possible to
implement this view as an extension.

Regards,

Fabrízio Mello



Re: [HACKERS] dynamic shared memory and locks

2014-01-05 Thread Andres Freund
On 2014-01-05 14:06:52 -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  For what it's worth, my vote is currently for #2.  I can't think of
  many interesting things to do with dynamic shared memory without having at
  least spinlocks, so I don't think we'd be losing much.  #1 seems
  needlessly unfriendly, #3 seems like a lot of work for not much, and
  #4 seems excessive at least as a solution to this particular problem,
  though there may be other arguments for it.  What do others think?
 
 I agree with this position.  There may be some good reason to drop
 --disable-spinlocks altogether in future, but DSM isn't a sufficient
 excuse.

Agreed that DSM isn't sufficient cause. The reasons I see for removing it
in the future are:
* It's not tested at all and it has been partially broken for
  significant periods of time. Afair Heikki just noticed the breakage
  accidentally when adding enough new spinlocks recently.
* It has shown up as problematic in a couple of patches while adding not
  much value (at least dsm, atomic ops, afair some others)
* It's slow enough that it's not of a practical value.
* Implementing simple support for spinlocks on realistic platforms isn't
  hard these days due to compiler intrinsics.
* The platforms that don't have a barrier implementation will rely on
  spinlocks. And for correctness those spinlocks should employ
  barriers. That might be more of an argument for getting rid of that
  fallback tho.


  I think we're also going to want to be able to create LWLocks in
  dynamic shared memory: some critical sections won't be short enough or
  safe enough to be protected by spinlocks.  At some level this seems
  easy: change LWLockAcquire and friends to accept an LWLock * rather
  than an LWLockId, and similarly change held_lwlocks[] to hold LWLock
  pointers rather than LWLockIds.
 
 I seem to recall that there was some good reason for keeping all the
 LWLocks in an array, back when the facility was first designed.
 I'm too lazy to research the point right now, but you might want to
 go back and look at the archives around when lwlock.c was written.

Your proposal is at
http://www.postgresql.org/message-id/1054.1001520...@sss.pgh.pa.us -
afaics there hasn't been much discussion about implementation details at
that level.

 In the end, though, that decision was taken before we were concerned
 about being able to add new LWLocks on the fly, which is what this is
 really about (whether they're stored in DSM or not is a secondary point).
 The pressure for that has gotten strong enough that it may be time to
 change the tradeoff.

I personally find the sharing of a cacheline between a buffer headers
spinlock and the lwlock much more interesting than DSM...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Store Extension Options

2014-01-05 Thread Robert Haas
On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I would suggest addressing Robert's concern about lack of error checking
 by refusing to allow a custom reloption to be set unless the relevant
 extension is loaded and checks it.  Unlike the postgresql.conf problem,
 I don't see any very good use-case for allowing an unchecked ALTER TABLE
 to occur.

How do you plan to resolve the associated dump/restore hazard?  AIUI,
that's why we allow people define any old this.that GUC that they want
without checking it - because the relevant shared library might not be
loaded at the time of definition, but only by time of use.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier

2014-01-05 Thread Mark Dilger
I am building a regression test system for replication and came across
this email thread.  I have gotten pretty far into my implementation, but
would be happy to make modifications if folks have improvements to
suggest.  If the community likes my design, or a modified version based
on your feedback, I'd be happy to submit a patch.

Currently I am cannibalizing src/test/pg_regress.c, but that could instead
be copied to src/test/pg_regress_replication.c or whatever.  The regression
test creates and configures multiple database clusters, sets up the
replication configuration for them, runs them each in nonprivileged mode
and bound to different ports, feeds all the existing 141 regression tests
into the master database with the usual checking that all the right results
are obtained, and then checks that the standbys have the expected
data.  This is possible all on one system because the database clusters
are chroot'ed to see their own /data directory and not the /data directory
of the other chroot'ed clusters, although the rest of the system, like /bin
and /etc and /dev are all bind mounted and visible to each cluster.

There of course is room to add as many replication tests as you like,
and the main 141 tests fed into the master could be extended to feed
more data and such.

The main drawbacks that I don't care for are:

1) 'make check' becomes 'sudo make check' because it needs permission
to run chroot.

2) I have no win32 version of the logic

3) Bind mounts either have to be created by the privileged pg_regress
process or have to be pre-existing on the system


#1 would not be as bad if pg_regress became pg_regress_replication, as
we could make the mantra into 'sudo make replicationcheck' or similar.
Splitting it from 'make check' also means IMHO that it could have heavier
tests that take longer to run, since people merely interested in building
and installing postgres would not be impacted by this.

#2 might be fixed by someone more familiar with win32 programming 
than I am.

#3 cannot be avoided as far as I can tell, but we could chose between
the two options.  So far, I have chosen to set up the directory structure
and add the bind mount logic to my /etc/fstab only once, rather than
having this get recreated every time I invoke 'sudo make check'.  The
community might prefer to go the other way, and have the directories
and bind mounts get set up each invocation; I have avoided that thus
far as I don't want 'sudo make check' (or 'sudo make replicationcheck')
to abuse its raised privileges and muck with the filesystem in a way
that could cause the user unexpected problems.


The main advantages that I like about this design are:

1) Only one system is required.  The developer does not need network
access to a second replication system.  Moreover, multiple database
clusters can be established with interesting replication hierarchies between
them, and the cost of each additional cluster is just another chroot
environment

2) Checking out the sources from git and then running 

    ./configure && make && sudo make replicationtest

is not particularly difficult, assuming the directories and mounts are
in place, or alternatively assuming that 'sudo make regressioncheck'
creates them for you if they don't already exist.

Comments and advice sincerely solicited,

mark

Re: [HACKERS] [PATCH] SQL assertions prototype

2014-01-05 Thread Peter Eisentraut
On 12/18/13, 2:22 PM, Andres Freund wrote:
 It would only force serialization for transactions that modify tables
 covered by the assert, that doesn't seem too bad. Anything covered by an
 assert shouldn't be modified frequently, otherwise you'll run into major
 performance problems.

I think that makes sense.  If you want to use assertions, you need to
run in serializable mode, otherwise you get an error if you modify
anything covered by an assertion.

In the future, someone could enhance this for other isolation levels,
but as Josh has pointed out, that would likely just be reimplementing
SSI with big locks.





Re: [HACKERS] [PATCH] SQL assertions prototype

2014-01-05 Thread Peter Eisentraut
On 12/18/13, 2:45 PM, Andres Freund wrote:
 Well, the patch's syntax seems to only allow to directly specify a SQL
 query to check - we could iterate over the querytree to gather all
 related tables and reject any function we do not understand.

Creating an assertion scans the query tree and attaches constraint
triggers to all tables covered by the assertion condition.  So it does
essentially work the way you say.






Re: [HACKERS] [PATCH] SQL assertions prototype

2014-01-05 Thread Peter Eisentraut
On 12/18/13, 4:04 PM, Andrew Dunstan wrote:
 Umm, that's really a major limitation in utility. We need to come up
 with a better answer than this, which would essentially hobble the
 facility.

We don't have any facility to run a trigger on just any command, it
needs to be triggered by a table.  Of course, that's where your on
commit even triggers come in, presumably.  But for example, constraint
triggers support deferrability, which an on commit trigger wouldn't.
We'd just need DML triggers on any/no tables.





Re: [HACKERS] more psprintf() use

2014-01-05 Thread Peter Eisentraut
On 1/2/14, 9:28 AM, Tom Lane wrote:
 Heikki is right and you are wrong.  There is an ancient supposition that
 datatype output functions, in particular, always return palloc'd strings.
 
 I recently got rid of the pfree's in the main output path, cf commit
 b006f4ddb988568081f8290fac77f9402b137120, which might explain why this
 patch passes regression tests; but there are still places in the code (and
 even more likely in third-party code) that will try to pfree the results.

Well, that seems kind of dangerous.  The next guy is going to write an
extension that is returning string constants directly, and there is no
straightforward way to detect this problem.  Perhaps we should have some
mode similar to the CLOBBER and COPY_*_TREES symbols to force a pfree()
in assertion-enabled builds?






Re: [HACKERS] more psprintf() use

2014-01-05 Thread Peter Eisentraut
On 1/2/14, 2:12 PM, Alvaro Herrera wrote:
 Peter Eisentraut wrote:
 
 psprintf() in place of hardcoded palloc(N) + sprintf() and the like.

 
 +values[j++] = psprintf("%d", stat.blkno);
 +values[j++] = psprintf("%c", stat.type);
 +values[j++] = psprintf("%d", stat.live_items);
 +values[j++] = psprintf("%d", stat.dead_items);
 +values[j++] = psprintf("%d", stat.avg_item_size);
 +values[j++] = psprintf("%d", stat.page_size);
 +values[j++] = psprintf("%d", stat.free_size);
 +values[j++] = psprintf("%d", stat.btpo_prev);
 +values[j++] = psprintf("%d", stat.btpo_next);
 +values[j++] = psprintf("%d", (stat.type == 'd') ? stat.btpo.xact :
 stat.btpo.level);
 +values[j++] = psprintf("%d", stat.btpo_flags);
  
  tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc),
 values);
 
 In cases such as this one, I have often wondered whether it'd be better
 to write this as DatumGetSometype() plus heap_form_tuple, instead of
 printing to strings and then building a tuple from those.

Probably.  As you can see, this style is only used in a few contrib
modules that all came from the same source, I think.





Re: [HACKERS] GiST support for inet datatypes

2014-01-05 Thread Emre Hasegeli
2013-12-17 Emre Hasegeli e...@hasegeli.com:
 Query planner never chooses to use the index for the operators which
 the index is particularly useful because selectivity estimation functions
 are missing. I am planning to work on them.

Attached patch adds selectivity estimation functions for the overlap and
adjacent operators. Other operators need a bit more work. I want to send it
now to get some feedback.


inet-selfuncs-v1.patch
Description: Binary data



Re: [HACKERS] more psprintf() use

2014-01-05 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 1/2/14, 9:28 AM, Tom Lane wrote:
 Heikki is right and you are wrong.  There is an ancient supposition that
 datatype output functions, in particular, always return palloc'd strings.
 
 I recently got rid of the pfree's in the main output path, cf commit
 b006f4ddb988568081f8290fac77f9402b137120, which might explain why this
 patch passes regression tests; but there are still places in the code (and
 even more likely in third-party code) that will try to pfree the results.

 Well, that seems kind of dangerous.  The next guy is going to write an
 extension that is returning string constants directly, and there is no
 straightforward way to detect this problem.  Perhaps we should have some
 mode similar to the CLOBBER and COPY_*_TREES symbols to force a pfree()
 in assertion-enabled builds?

Seems kinda backwards.  If we want to put any effort into this issue,
it'd be better to head in the direction of making the world safe for
output functions to return constants, ie deprecate rather than enforce
the practice of pfree'ing their results.  But see
http://www.postgresql.org/message-id/12646.1383420...@sss.pgh.pa.us

regards, tom lane




Re: [HACKERS] [PATCH] Store Extension Options

2014-01-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I would suggest addressing Robert's concern about lack of error checking
 by refusing to allow a custom reloption to be set unless the relevant
 extension is loaded and checks it.  Unlike the postgresql.conf problem,
 I don't see any very good use-case for allowing an unchecked ALTER TABLE
 to occur.

 How do you plan to resolve the associated dump/restore hazard?

pg_dump creates extensions before tables, no?  So what dump/restore
hazard?

 AIUI,
 that's why we allow people define any old this.that GUC that they want
 without checking it - because the relevant shared library might not be
 loaded at the time of definition, but only by time of use.

No, the reason we allow GUCs to be set before the relevant library is
loaded is so that you can put a setting into postgresql.conf without
thereby having to make the extension be load-into-postmaster.

regards, tom lane




Re: [HACKERS] [PATCH] SQL assertions prototype

2014-01-05 Thread Andres Freund


Peter Eisentraut pete...@gmx.net schrieb:
On 12/18/13, 2:22 PM, Andres Freund wrote:
 It would only force serialization for transactions that modify tables
 covered by the assert, that doesn't seem too bad. Anything covered by an
 assert shouldn't be modified frequently, otherwise you'll run into major
 performance problems.

I think that makes sense.  If you want to use assertions, you need to
run in serializable mode, otherwise you get an error if you modify
anything covered by an assertion.

In the future, someone could enhance this for other isolation levels,
but as Josh has pointed out, that would likely just be reimplementing
SSI with big locks.

SSI only actually works correctly if all transactions use SSI... I am not sure
if we can guarantee that the subset we'd require would be safe without the
read side using SSI.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier

2014-01-05 Thread Greg Stark
-- 
greg
On 5 Jan 2014 14:54, Mark Dilger markdil...@yahoo.com wrote:

 I am building a regression test system for replication and came across
 this email thread.  I have gotten pretty far into my implementation, but
 would be happy to make modifications if folks have improvements to
 suggest.  If the community likes my design, or a modified version based
 on your feedback, I'd be happy to submit a patch.

This sounds pretty cool. The real trick will be in testing concurrent
behaviour -- I.e. queries on the slave when it's replaying logs at a
certain point. But right now we have nothing so anything would be an
improvement.

  This is possible all on one system because the database clusters
 are chroot'ed to see their own /data directory and not the /data directory
 of the other chroot'ed clusters, although the rest of the system, like
/bin
 and /etc and /dev are all bind mounted and visible to each cluster.

This isn't necessary. You can use the same binaries and run initdb with a
different location just fine. Then start up the database with -D to specify
the directory.


Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier

2014-01-05 Thread Michael Paquier
On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger markdil...@yahoo.com wrote:
 I am building a regression test system for replication and came across
 this email thread.  I have gotten pretty far into my implementation, but
 would be happy to make modifications if folks have improvements to
 suggest.  If the community likes my design, or a modified version based
 on your feedback, I'd be happy to submit a patch.
Yeah, this would be nice to look at, core code definitely needs to have
some more infrastructure for such a test suite. I didn't get the time to go
back to it since I began this thread though :)

 Currently I am cannibalizing src/test/pg_regress.c, but that could instead
 be copied to src/test/pg_regress_replication.c or whatever.  The
regression
 test creates and configures multiple database clusters, sets up the
 replication configuration for them, runs them each in nonprivileged mode
 and bound to different ports, feeds all the existing 141 regression tests
 into the master database with the usual checking that all the right
results
 are obtained, and then checks that the standbys have the expected
 data.  This is possible all on one system because the database clusters
 are chroot'ed to see their own /data directory and not the /data directory
 of the other chroot'ed clusters, although the rest of the system, like
/bin
 and /etc and /dev are all bind mounted and visible to each cluster.
Having vanilla regressions run in a cluster with multiple nodes and check
the results on a standby is the tip of the iceberg though. What I had in
mind when I began this thread was to have more than a copy/paste of
pg_regress, but an infrastructure that people could use to create and
customize tests by having an additional control layer on the cluster
itself. For example, testing replication is not only a matter of creating
and setting up the nodes, but you might want to be able to initialize, add,
remove nodes during the tests. Node addition would be either a new fresh
master (this would be damn useful for a test suite for logical replication
I think), or a slave node with custom recovery parameters to test
replication, as well as PITR, archiving, etc. Then you need to be able to
run SQL commands on top of that to check if the results are consistent with
what you want.

A possible input for a test that users could provide would be something
like that:
# Node information for tests
nodes
{
{node1, postgresql.conf params, recovery.conf params}
{node2, postgresql.conf params, recovery.conf params, slave of node1}
}
# Run test
init node1
run_sql node1 file1.sql
# Check output
init node2
run_sql node2 file2.sql
# Check that results are fine
# Process

The main problem is actually how to do that. Having some smart shell
infrastructure would be simple and would facilitate (?) the maintenance of
code used to run the tests. On the contrary having a C program would make
the maintenance of code to run the tests more difficult (?) for a trade
with more readable test suite input like the one I wrote above. This might
also make the test input more readable for a human eye, in the shape of
what is already available in src/test/isolation.

Another possibility could be also to integrate directly a recovery/backup
manager in PG core, and have some tests for it, or even include those tests
directly with pg_basebackup or an upper layer of it.

 There of course is room to add as many replication tests as you like,
 and the main 141 tests fed into the master could be extended to feed
 more data and such.

 The main drawbacks that I don't care for are:

 1) 'make check' becomes 'sudo make check' because it needs permission
 to run chroot.
-1 for that: developers should not need to use root to run the regression suite.

 2) I have no win32 version of the logic
For a first shot I am not sure that it matters much.

 The main advantages that I like about this design are:

 1) Only one system is required.  The developer does not need network
 access to a second replication system.  Moreover, multiple database
 clusters can be established with interesting replication hierarchies
between
 them, and the cost of each additional cluster is just another chroot
 environment
An assumption of the test suite is, I think, to allow developers to check for
bugs on a local server only. This simplifies how the test suite is written
and you don't need to enter in things like VM settings or cross-environment
tests, things that could be done already nicely by frameworks of the type
Jenkins. What I think people would like to have is that:
cd src/test/replication && make check/installcheck
And have the test run for them.

Regards,
-- 
Michael


Re: [HACKERS] dynamic shared memory and locks

2014-01-05 Thread Robert Haas
On Sun, Jan 5, 2014 at 2:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I seem to recall that there was some good reason for keeping all the
 LWLocks in an array, back when the facility was first designed.
 I'm too lazy to research the point right now, but you might want to
 go back and look at the archives around when lwlock.c was written.

To some extent it's an orthogonal question.  It's true that there
isn't much point in using LWLock * rather than LWLockId to refer to
LWLocks unless we wish to be able to store them outside the shared
memory segment, but the reverse is not true: just because we have the
ability to move things outside the main array in the general case
doesn't make it a good idea in any particular case.  Andres's email
seems to indicate that he sees performance advantages in moving buffer
locks elsewhere, and I have a sneaking suspicion that we'll find that
it's more convenient to move some other things around as well, but
that's policy, not mechanism.  Very little of the core LWLock
machinery actually cares how the locks are stored, so the attached
patch to make it use LWLock * rather than LWLockId as a handle is
pretty straightforward.

The only real problem I see here is that we occasionally *print out*
LWLockIds as a way of identifying lwlocks.  This is not a great
system, but printing out pointers is almost certainly much worse (e.g.
because of ASLR).  The cases where this is currently an issue are:

- You try to release a lwlock you haven't acquired.  We elog(ERROR) the ID.
- You define LWLOCK_STATS.  The resulting reports print the lock ID.
- You define LOCK_DEBUG and set trace_lwlocks=true.  We print the lock
ID in the trace messages.
- You compile with --enable-dtrace.  We pass the lock IDs to the dtrace probes.

In the attached patch I handled the first case by printing the pointer
(which I don't feel too bad about) and the remaining three cases by
leaving them broken.  I wouldn't shed a tear about ripping out
trace_lwlocks, but LWLOCK_STATS is useful and I bet somebody is using
--enable-dtrace, so we probably need to fix those cases at least.  I
suppose one option is to make LWLOCK_STATS and the dtrace probes only
look at locks in the main array and just ignore everything else.  But
that's kind of crappy, especially if we're going to soon move buffer
locks out of the main array.

Another idea is to include some identifying information in the lwlock.
 For example, each lwlock could have a char *name in it, and we could
print the name.  In theory, this could be a big step forward in terms
of usability, because it'd spare us all needing to remember that 4 ==
ProcArrayLock.  But it's awkward for buffer locks, of which there
might be a great many, and we surely don't want to allocate a
dynamically-generated string in shared memory for each one.  You could
do a bit better by making the identifying information a string plus an
integer, because then all the buffer locks could set the string to a
static constant like "buffer content lock" and the integer to the
buffer number, and similarly for lock manager partition locks and so
on.  This is appealing, but would increase the size of LWLockPadded
from 16 bytes to 32 on 64-bit platforms where slock_t is four bytes or
less, which I'm not that excited about even though it would reduce
cache line contention in some cases.

Yet a third idea is to try to reverse-engineer a name for a given
lwlock from the pointer address.  If it's an offset into the main
array, this is easy enough to do, and even if we ended up with several
arrays (like one for bufmgr locks) it wouldn't be too hard to write
code to figure out which array contains it and emit the appropriate
string.  The only real problem that I see with this is that it might
cause a performance hit.  A performance hit when running with
trace_lwlocks or LWLOCK_STATS is not really a problem, but people
won't like it if --enable-dtrace slows things down.
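
A minimal standalone sketch of that pointer-to-identifier mapping, with
MainLWLockArray and its size as stand-ins for the real globals:

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: recover an identifier from an lwlock pointer by checking which
 * array it falls into.  The names and the count are stand-ins for the
 * real globals; a multi-array version would test each array in turn. */
typedef struct { int mutex; } LWLock;

#define NUM_FIXED_LWLOCKS 48
static LWLock MainLWLockArray[NUM_FIXED_LWLOCKS];

/* Returns the index in the main array, or -1 if the pointer lives
 * elsewhere (e.g. a buffer lock or an extension's lock in DSM). */
static int
lwlock_id_from_pointer(const LWLock *lock)
{
    if (lock >= MainLWLockArray &&
        lock < MainLWLockArray + NUM_FIXED_LWLOCKS)
        return (int) (lock - MainLWLockArray);
    return -1;
}
```

The range check and pointer subtraction are only a few instructions, which
is the part that matters for the --enable-dtrace case.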

Preferences, other ideas?

None of these ideas are a complete solution for LWLOCK_STATS.  In the
other three cases noted above, we only need an identifier for the lock
instantaneously, so that we can pass it off to the logger or dtrace
or whatever.  But LWLOCK_STATS wants to hold on to data about the
locks that were visited until the end of the session, and it does that
using an array that is *indexed* by lwlockid.  I guess we could
replace that with a hash table.  Ugh.  Any suggestions?

(Incidentally, while developing this patch I found a bug in the
current code: lock.c iterates over all PGPROCs from 0 to
ProcGlobal->allProcCount and takes the backendLock for each, but if
max_prepared_transactions > 0 then the PGPROCs for prepared transactions
do not have a backendLock and so we take and release BufFreelistLock -
i.e. 0 - a number of times equal to max_prepared_transactions.  That's
not cool.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] [PATCH] Store Extension Options

2014-01-05 Thread Robert Haas
On Sun, Jan 5, 2014 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I would suggest addressing Robert's concern about lack of error checking
 by refusing to allow a custom reloption to be set unless the relevant
 extension is loaded and checks it.  Unlike the postgresql.conf problem,
 I don't see any very good use-case for allowing an unchecked ALTER TABLE
 to occur.

 How do you plan to resolve the associated dump/restore hazard?

 pg_dump creates extensions before tables, no?  So what dump/restore
 hazard?

Creating the extension doesn't guarantee that the shared library will
always be loaded.  If nothing else, think about partial restores.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread Robert Haas
On Sun, Jan 5, 2014 at 1:28 PM, knizhnik knizh...@garret.ru wrote:
 From my point of view it is not a big problem that it is not possible to
 place LWLocks in DSM.
 I can allocate LWLocks in the standard way - using RequestAddinLWLocks - and
 use them for synchronization.

Sure, well, that works fine if you're being loaded from
shared_preload_libraries.  If you want to be able to load the
extension after startup time, though, it's no good.

 And what I still do not completely understand - how does DSM ensure that a
 segment created by one PostgreSQL process will be mapped to the same
 virtual memory address in all other PostgreSQL processes?

It doesn't.  One process calls dsm_create() to create a shared memory
segment.  Other processes call dsm_attach() to attach it.  There's no
guarantee that they'll map it at the same address; they'll just map it
somewhere.
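
The same attach-anywhere behavior can be demonstrated with plain POSIX
shared mappings; this self-contained sketch is an analogy, not the dsm_*
API itself:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the same file twice, as two processes attaching one DSM segment
 * would.  Returns 0 if the mappings share backing storage. */
static int
demo_two_mappings(void)
{
    FILE *f = tmpfile();

    if (f == NULL)
        return -1;

    int fd = fileno(f);

    if (ftruncate(fd, 4096) != 0)
        return -1;

    char *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (a == MAP_FAILED || b == MAP_FAILED)
        return -1;

    strcpy(a, "hello");                 /* write through one mapping... */
    if (strcmp(b, "hello") != 0)        /* ...visible through the other */
        return -1;

    /* a and b are generally distinct addresses, so a pointer stored
     * *inside* the segment means nothing in the other mapping: relative
     * offsets from the segment base must be used instead. */
    return 0;
}
```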

 Or may be DSM doesn't guarantee than DSM segment is mapped to the same
 address in all processes?
 In this case it significantly complicates DSM usage: it will not be possible
 to use direct pointers.

Yeah, that's where we're at.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread Robert Haas
On Sun, Jan 5, 2014 at 1:44 PM, james ja...@mansionfamily.plus.com wrote:
 I'm intrigued - how are the handles shared between children that are peers
 in the current scheme?  Some handle transfer must already be in place.

That's up to the application.  After calling dsm_create(), you call
dsm_segment_handle() to get the 32-bit integer handle for that
segment.  Then you have to get that to the other process(es) somehow.
If you're trying to share a handle with a background worker, you can
stuff it in bgw_main_arg.  Otherwise, you'll probably need to store it
in the main shared memory segment, or a file, or whatever.

 Could you share the handles to an immortal worker if you want to reduce any
 potential impact on the postmaster?

You could, but that seems like thin justification for spawning another
process, and how immortal is that worker, really?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Store Extension Options

2014-01-05 Thread Fabrízio de Royes Mello
On Mon, Jan 6, 2014 at 1:08 AM, Robert Haas robertmh...@gmail.com wrote:

 On Sun, Jan 5, 2014 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I would suggest addressing Robert's concern about lack of error checking
  by refusing to allow a custom reloption to be set unless the relevant
  extension is loaded and checks it.  Unlike the postgresql.conf problem,
  I don't see any very good use-case for allowing an unchecked ALTER TABLE
  to occur.
 
  How do you plan to resolve the associated dump/restore hazard?
 
  pg_dump creates extensions before tables, no?  So what dump/restore
  hazard?

 Creating the extension doesn't guarantee that the shared library will
 always be loaded.  If nothing else, think about partial restores.


You are correct. pg_dump exports reloptions using the WITH clause of the
CREATE TABLE statement, i.e.:

CREATE TABLE foo (
)
WITH (autovacuum_enabled=false, bdr.do_replicate=false);

So if this statement checks that the 'bdr' extension is loaded, then a
partial restore can fail. At this point we have two choices:

1) do not check if extension already is loaded

2) change pg_dump to produce ALTER TABLE ... SET (...) instead of
CREATE TABLE ... WITH (...) to set reloptions

Comments?
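
For illustration, option 2 would change the dump of the example above to
something like this (a sketch, assuming the extension-qualified reloption
syntax under discussion):

```sql
-- Option 1 (today): reloptions inline, all-or-nothing with CREATE TABLE
CREATE TABLE foo (
)
WITH (autovacuum_enabled = false, bdr.do_replicate = false);

-- Option 2: reloptions split out, so in a partial restore only the
-- extension-owned setting fails if the bdr extension is missing
CREATE TABLE foo (
);
ALTER TABLE foo SET (autovacuum_enabled = false);
ALTER TABLE foo SET (bdr.do_replicate = false);
```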

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


[HACKERS] Compiling extensions on Windows

2014-01-05 Thread Craig Ringer
Hi all

Out of personal interest (in pain and suffering) I was recently looking
into how to compile extensions out-of-tree on Windows using Visual
Studio (i.e. no PGXS).

It looks like the conventional answer to this is "Do a source build of
PG, compile your ext in-tree in contrib/, and hope the result is binary
compatible with release PostgreSQL builds for Windows". Certainly that's
how I've been doing it to date.

How about everyone else here? Does anyone actually build and distribute
extensions out of tree at all?

I'm interested in making the Windows installer distributions a bit more
extension dev friendly. In particular, I'd really like to see EDB's
Windows installers include the libintl.h for the included libintl, since
its omission, combined with Pg being built with ENABLE_NLS, tends to
break things horribly. Users can always undefine ENABLE_NLS, but it's an
unnecessary roadblock.

Are there any objections from -hackers to including 3rd party headers
for libs we expose in our public headers in the binary distribution?

Other than bundling 3rd party headers, any ideas/suggestions for how we
might make ext building saner on Windows?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Store Extension Options

2014-01-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jan 5, 2014 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 pg_dump creates extensions before tables, no?  So what dump/restore
 hazard?

 Creating the extension doesn't guarantee that the shared library will
 always be loaded.

No, but unless the plan is that no validation happens at all (which
I gather is not your desire) then there must be some mechanism for
figuring out which extension owns a given reloption and asking it
to validate the value.  This might be more complicated than a passive
hook, but I don't feel bad about demanding that it work like that.

regards, tom lane




Re: [HACKERS] [PATCH] Store Extension Options

2014-01-05 Thread Tom Lane
Fabrízio de Royes Mello fabriziome...@gmail.com writes:
 You are correct. pg_dump export reloptions using WITH clause of CREATE
 TABLE statement. I.e.:

 CREATE TABLE foo (
 )
 WITH (autovacuum_enabled=false, bdr.do_replicate=false);

 So if this statement checks that the 'bdr' extension is loaded, then a
 partial restore can fail.

I see absolutely *nothing* wrong with failing that command if bdr is not
installed.  For an analogy, if this table includes a column of type bar
defined by some extension baz, we are certainly going to fail the
CREATE TABLE if baz isn't installed.

Now, if bdr is installed but the validation doesn't happen unless bdr
is loaded in some sense, then that is an implementation deficiency
that I think we can insist be rectified before this feature is accepted.

regards, tom lane




Re: [HACKERS] Compiling extensions on Windows

2014-01-05 Thread Andrew Dunstan


On 01/05/2014 10:32 PM, Craig Ringer wrote:

Hi all

Out of personal interest (in pain and suffering) I was recently looking
into how to compile extensions out-of-tree on Windows using Visual
Studio (i.e. no PGXS).

It looks like the conventional answer to this is "Do a source build of
PG, compile your ext in-tree in contrib/, and hope the result is binary
compatible with release PostgreSQL builds for Windows". Certainly that's
how I've been doing it to date.

How about everyone else here? Does anyone actually build and distribute
extensions out of tree at all?

I'm interested in making the Windows installer distributions a bit more
extension dev friendly. In particular, I'd really like to see EDB's
Windows installers include the libintl.h for the included libintl, since
its omission, combined with Pg being built with ENABLE_NLS, tends to
break things horribly. Users can always undefine ENABLE_NLS, but it's an
unnecessary roadblock.

Are there any objections from -hackers to including 3rd party headers
for libs we expose in our public headers in the binary distribution?

Other than bundling 3rd party headers, any ideas/suggestions for how we
might make ext building saner on Windows?



If you're bundling a DLL then I don't see why the corresponding header 
file shouldn't be included also.


cheers

andrew






Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-05 Thread Amit Kapila
On Sun, Jan 5, 2014 at 11:04 PM, james ja...@mansionfamily.plus.com wrote:
 On 05/01/2014 16:50, Robert Haas wrote:

  But on Windows, segments are *automatically*
 destroyed *by the operating system* when the last process unmaps them,
 so it's not quite so clear to me how we can allow it there.  The main
 shared memory segment is no problem because the postmaster always has
 it mapped, even if no one else does, but that doesn't help for dynamic
 shared memory segments.

 Surely you just need to DuplicateHandle into the parent process?

   Ideally DuplicateHandle should work, but while going through the Windows
   internals of shared memory functions at the link below, I observed that
   it is mentioned that this works for child processes.
   http://msdn.microsoft.com/en-us/library/ms810613.aspx
   Refer to the section "Inheriting and duplicating memory-mapped file
   object handles".

  If you
 want to (tidily) dispose of it at some time, then you'll need to tell the
 postmaster that you have done so and what the handle is in its process,
 but if you just want it to stick around, then you can just pass it up.

Duplicate handle should work, but we need to communicate the handle
to the other process using IPC.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] RFC: Async query processing

2014-01-05 Thread David Fetter
On Fri, Jan 03, 2014 at 03:06:11PM -0200, Claudio Freire wrote:
 On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Claudio Freire klaussfre...@gmail.com writes:
  On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote:
  Loading data into the database isn't such an uncommon task.  Not
  everything is OLTP.
 
  Truly, but a sustained insert stream of 10 Mbps is certainly way
  beyond common non-OLTP loads. This is far more specific than non-OLTP.
 
  I think Florian has a good point there, and the reason is this: what
  you are talking about will be of exactly zero use to applications that
  want to see the results of one query before launching the next.  Which
  eliminates a whole lot of apps.  I suspect that almost the *only*
  common use case in which a stream of queries can be launched without
  feedback is going to be bulk data loading.  It's not clear at all
  that pipelining the PQexec code path is the way to better performance
  for that --- why not use COPY, instead?
 
 You're forgetting ORM workloads.
 
 ORMs can usually plan the inserts to be in a sequence that both don't
 require feedback (except the knowledge that they were successful), and
 that do not violate constraints.
 
 Flushing a whole object hierarchy for instance, can be done without
 feedback. Not even serial columns need feedback, since many ORMs
 (SQLAlchemy, Hibernate) support allocation of ID sequences in batches
 (by issuing a proper select nextval).

This is already doable using data-modifying WITH clauses.  Whether
ORMs are (or should be made) smart enough to take advantage of this is
probably out of the scope of this discussion, though.  For what it's
worth, I'm not in favor of orthogonality in languages, so +1 from me.
Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

