Re: [HACKERS] Fixing pg_basebackup with tablespaces found in $PGDATA
On 01/02/2014 06:53 AM, Dimitri Fontaine wrote:
> As much as I've seen people frown upon $subject, it still happens in the wild

I met a new case of it a couple of weeks ago, so I can certainly confirm that.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Support for pg_stat_archiver view
On Sat, Jan 4, 2014 at 2:01 PM, Gabriele Bartolini gabriele.bartol...@2ndquadrant.it wrote:
> On 04/01/14 13:25, Magnus Hagander wrote:
>> With those two, I think it would make much sense to have a view like this.
> Ok, I will prepare version 2 with those.
>> Oh, and you need to change the format id number of the stats file.
> I have not found any instruction on how to set it. I assume you are talking about this:
>
>     PGSTAT_FILE_FORMAT_ID    0x01A5BC9B
>
> Any suggestion is welcome.

Yes, that's what I'm talking about. And just increment it by 1. Not sure where the original value came from, but that's what people have been doing recently.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
Re: [HACKERS] RFC: Async query processing
On 01/04/2014 01:06 AM, Claudio Freire wrote:
> You're forgetting ORM workloads.

I'm impressed that you've come up with an area where ORMs are beneficial ;-)

JDBC also has a statement batching interface. Right now PgJDBC just unwraps the batch and runs each query individually. Any server-side async-support improvements should probably consider the needs of executing a batch. The batch might be one PreparedStatement with many different parameter sets, or it might be a series of unrelated statements. A way for PgJDBC to run the batch without syncing with the server after each query would be really helpful. So would a way to BIND an array of parameters, so we could execute a prepared statement once with multiple parameter sets and then sync up with the server after all executions.

As for ORMs benefiting from this: remember that nPgSQL and PgJDBC don't use libpq. So the libpq changes would only help ORMs based on things like Python (psycopg2), Ruby (Pg gem), etc., where they're using libpq wrapper drivers.

> Execute-many of prepared statements is another one, quite common.

That's the case I'd really love to see proper server-side batch support for. BIND_MULTIPLE, EXECUTE.

> I'm not sure what would happen if one of the queries returned an error. If in a transaction, all the following queries would error out I'd imagine. If not, they would simply be executed blindly.. am I correct?

It's not just dealing with erroring out. Many ORMs look at the count of rows affected to detect whether an operation conflicted with another concurrent operation when optimistic concurrency control is in use. E.g.:

    UPDATE t SET x = 'fred' WHERE rowversion = 4;

will be seen to fail if it reports that it's affected zero rows. This is one of the reasons ORM users have such serious problems with the write side of our partitioning support - we discard affected row counts, and many ORMs don't deal well with that. At least in JDBC, executeBatch returns an array of row counts.

So you can't throw away affected row counts when running batches; they must be returned to the client. It doesn't matter if it's a single protocol message with a list of IDs, or a series of individual messages, though.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
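The queue-many-executes-then-one-Sync shape described above, with a row count coming back for every statement, can be sketched in miniature. Everything here - the class names, the fake in-memory "server" - is invented for illustration; this is not libpq, PgJDBC, or the FE/BE protocol, just the behavior a batching client would want:

```python
# Toy model of client-side batching: queue many Bind/Execute pairs, send one
# Sync, and get back one affected-row count per statement, so an ORM doing
# optimistic locking can still detect a conflicted UPDATE.

class FakeServer:
    """Stands in for a backend; returns a CommandComplete-style row count."""
    def __init__(self):
        self.rows = [{"id": 1, "rowversion": 4}]

    def execute(self, stmt, params):
        # Pretend every statement is an UPDATE matching on rowversion.
        return sum(1 for r in self.rows if r["rowversion"] == params[0])

class BatchingConnection:
    def __init__(self, server):
        self.server = server
        self.queue = []

    def queue_execute(self, stmt, params):
        self.queue.append((stmt, params))   # Bind + Execute, no Sync yet

    def sync(self):
        # One round trip: flush everything queued, read all results back.
        counts = [self.server.execute(s, p) for s, p in self.queue]
        self.queue.clear()
        return counts

conn = BatchingConnection(FakeServer())
conn.queue_execute("UPDATE t SET x='fred' WHERE rowversion = %s", (4,))
conn.queue_execute("UPDATE t SET x='bob'  WHERE rowversion = %s", (99,))
counts = conn.sync()
print(counts)  # -> [1, 0]: the second UPDATE matched nothing
```

The point of the sketch is only that the per-statement counts survive the batching, which is exactly what JDBC's executeBatch contract requires.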
Re: [HACKERS] RFC: Async query processing
On 01/04/2014 01:22 AM, Merlin Moncure wrote:
> Long term, I'd rather see an optimized 'ORM flush' assemble the data into a structured data set (perhaps a JSON document) and pass it to some receiving routine that decomposed it into records.

The same is true on the input side. I'd much rather be sending an ORM client a big JSON / YAML / whatever graph than the horrible, duplication-filled chained LEFT JOIN projections they currently rely on. When they're not just doing n+1 selects, which is worse.

I think that's really a side issue, though. ORMs aren't going to change in a hurry, and batching / fire-and-forget support is good for all sorts of other jobs too.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] RFC: Async query processing
On Fri, Jan 3, 2014 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> I think Florian has a good point there, and the reason is this: what you are talking about will be of exactly zero use to applications that want to see the results of one query before launching the next.

There are techniques for handling that, actually. For a better explanation than I can give, see http://kentonv.github.io/capnproto/rpc.html. It's mostly a language feature, but it does require support from the protocol to be able to reference data in earlier responses in your subsequent requests.

You can pull this trick by having an RPC server near the database, so all the latency is handled by the RPC server and not the database connection. But it seems to me that that's unnecessary complication and shouldn't be necessary. The database protocol basically is an RPC layer.

--
greg
Re: [HACKERS] RFC: Async query processing
On 01/05/2014 03:11 PM, Greg Stark wrote:
> On Fri, Jan 3, 2014 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> I think Florian has a good point there, and the reason is this: what you are talking about will be of exactly zero use to applications that want to see the results of one query before launching the next.
> There are techniques for handling that actually. For a better explanation than I can do see http://kentonv.github.io/capnproto/rpc.html. It's mostly a language feature but it does require support from the protocol to be able to reference data in earlier responses in your subsequent requests.

Purely from a perspective of latency avoidance, sub-queries, WITH or stored procedures can achieve the same thing, and work even if the intermediate result has to undergo some transformation. :-)

--
Florian Weimer / Red Hat Product Security Team
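The "promise pipelining" trick Greg points to can be sketched very roughly: a later request in the same batch refers to an earlier request's not-yet-known result by reference, so both requests go out before any reply comes back. All names here are invented and the handlers are placeholders; this only illustrates the idea from the Cap'n Proto RPC description, not any real protocol:

```python
# Sketch of promise pipelining: request 2 names request 1's result instead
# of waiting for it, so the whole batch is one round trip.

class Ref:
    """Placeholder for the result of an earlier request in the same batch."""
    def __init__(self, index):
        self.index = index

def run_pipeline(requests, handlers):
    """Server side: resolve Ref arguments against earlier results."""
    results = []
    for name, arg in requests:
        value = results[arg.index] if isinstance(arg, Ref) else arg
        results.append(handlers[name](value))
    return results

# Hypothetical handlers: insert a customer (returning its new id), then an
# order that needs that id.
handlers = {
    "insert_customer": lambda name: 42,
    "insert_order":    lambda cust_id: cust_id * 10,
}

# The client queues both requests at once; the second one uses the
# not-yet-known customer id from the first via Ref(0).
results = run_pipeline(
    [("insert_customer", "fred"), ("insert_order", Ref(0))],
    handlers,
)
print(results)  # -> [42, 420]
```

As Florian notes, a WITH query or stored procedure achieves the same latency saving server-side without any protocol change.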
Re: [HACKERS] RFC: Async query processing
On 01/04/2014 04:39 PM, Martijn van Oosterhout wrote:
> Why switch between COPY commands, why could you not do it in one? For example:
>
> COPY table1(col1, col2, ...), table2(col1, col2, ...) FROM STDIN WITH (tableoids);
> tableoid1	col1	col2	...
> tableoid2	...
> ...
> \.

My original idea was to avoid any server-side changes, so that applications do not need fallback code for talking to old servers.

--
Florian Weimer / Red Hat Product Security Team
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On Sat, Jan 4, 2014 at 3:27 PM, knizhnik knizh...@garret.ru wrote:
> 1. I want IMCS to work with PostgreSQL versions not supporting DSM (dynamic shared memory), like 9.2, 9.3.1, ...

Yeah. If it's loaded at postmaster start time, then it can work with any version. On 9.4+, you could possibly make it work even if it's loaded on the fly by using the dynamic shared memory facilities. However, there are currently some limitations to those facilities that make some things you might want to do tricky. There are pending patches to lift some of these limitations.

> 2. IMCS is using PostgreSQL hash table implementation (ShmemInitHash, hash_search, ...) May be I missed something - I just noticed DSM and have no chance to investigate it, but looks like hash table can not be allocated in DSM...

It wouldn't be very difficult to write an analog of ShmemInitHash() on top of the dsm_toc patch that is currently pending. A problem, though, is that it's not currently possible to put LWLocks in dynamic shared memory, and even spinlocks will be problematic if --disable-spinlocks is used. I'm due to write a post about these problems; perhaps I should go do that.

> 3. IMCS is allocating memory using ShmemAlloc. In case of using DSM I have to provide own allocator (although creation of non-releasing memory allocator should not be a big issue).

The dsm_toc infrastructure would solve this problem.

> 4. Current implementation of DSM still suffers from 256Gb problem. Certainly I can create multiple segments and so provide workaround without using huge pages, but it complicates allocator.

So it sounds like DSM should also support huge pages somehow. I'm not sure what that should look like.

> 5. I wonder if I dynamically add new DSM segment - will it be available for other PostgreSQL processes? For example I run query which loads data in IMCS and so needs more space and allocates new DSM segment. Then another query is executed by other PostgreSQL process which tries to access this data. This process is not forked from the process created this new DSM segment, so I do not understand how this segment will be mapped to the address space of this process, preserving address... Certainly I can prohibit dynamic extension of IMCS storage (hoping that in this case there will be no such problem with DSM). But in this case we will loose the main advantage of using DSM instead of old schema of plugin's private shared memory.

You can definitely dynamically add a new DSM segment; that's the point of making it *dynamic* shared memory. What's a bit tricky as things stand today is making sure that it sticks around. The current model is that the DSM segment is destroyed when the last process unmaps it. It would be easy enough to lift that limitation on systems other than Windows; we could just add a dsm_keep_until_shutdown() API or something similar. But on Windows, segments are *automatically* destroyed *by the operating system* when the last process unmaps them, so it's not quite so clear to me how we can allow it there. The main shared memory segment is no problem because the postmaster always has it mapped, even if no one else does, but that doesn't help for dynamic shared memory segments.

> 6. IMCS has some configuration parameters which has to be set through postgresql.conf. So in any case user has to edit postgresql.conf file. In case of using DSM it will be not necessary to add IMCS to shared_preload_libraries list. But I do not think that it is so restrictive and critical requirement, is it?

I don't really see a problem here. One of the purposes of dynamic shared memory (and dynamic background workers) is precisely that you don't *necessarily* need to put extensions that use shared memory in shared_preload_libraries - or in other words, you can add the extension to a running server without restarting it. If you know in advance that you will want it, you probably still *want* to put it in shared_preload_libraries, but part of the idea is that we can get away from requiring that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
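The "dynamically add a segment, attach from an unrelated process" model has a rough stdlib analogue in Python's multiprocessing.shared_memory, which is used below purely as an illustration of the concept - it is not PostgreSQL's dsm.c API, and the segment name here is generated, not anything a real backend would use:

```python
# A process that knows a segment's name can map it after creation, without
# sharing ancestry with the creator - the property knizhnik asks about.
from multiprocessing import shared_memory

# "Backend A" creates a segment on the fly and publishes its name.
seg = shared_memory.SharedMemory(create=True, size=64)
seg.buf[:5] = b"hello"
name = seg.name          # would be stored where other backends can find it

# "Backend B" attaches by name; it was not forked from A after creation.
other = shared_memory.SharedMemory(name=name)
data = bytes(other.buf[:5])
print(data)  # -> b'hello'

# Once every attach is closed and the segment unlinked, it is gone --
# loosely analogous to "destroyed when the last process unmaps it".
other.close()
seg.close()
seg.unlink()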
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On 05/01/2014 16:50, Robert Haas wrote:
> But on Windows, segments are *automatically* destroyed *by the operating system* when the last process unmaps them, so it's not quite so clear to me how we can allow it there. The main shared memory segment is no problem because the postmaster always has it mapped, even if no one else does, but that doesn't help for dynamic shared memory segments.

Surely you just need to DuplicateHandle into the parent process? If you want to (tidily) dispose of it at some time, then you'll need to tell the postmaster that you have done so and what the handle is in its process, but if you just want it to stick around, then you can just pass it up.
[HACKERS] dynamic shared memory and locks
One of the things that you might want to do with dynamic shared memory is store a lock in it. In fact, my bet is that almost everything that uses dynamic shared memory will want to do precisely that, because, of course, it's dynamic *shared* memory, which means that it is concurrently accessed by multiple processes, which tends to require locking. Typically, what you're going to want are either spinlocks (for very short critical sections) or lwlocks (for longer ones). It doesn't really make sense to talk about storing heavyweight locks in dynamic shared memory, because we're talking about storing locks with the data structures that they protect, and heavyweight locks are used to protect database or shared objects, not shared memory structures. Of course, someone might think of trying to provide a mechanism for the heavyweight lock manager to overflow to dynamic shared memory, but that's a different thing altogether and not what I'm talking about here.

Right now, storing spinlocks in dynamic shared memory *almost* works, but there are problems with --disable-spinlocks. In that configuration, we use semaphores to simulate spinlocks. Every time someone calls SpinLockInit(), it's going to allocate a new semaphore which will never be returned to the operating system, so you're pretty quickly going to run out. There are a couple of things we could do about this:

1. Decide we don't care. If you compile with --disable-spinlocks, and then you try to use dynamic shared memory, it's going to leak semaphores until none remain, and then start failing from there until the postmaster is restarted. If you don't like that, provide a working spinlock implementation for your platform.

2. Forbid the use of dynamic shared memory when compiling with --disable-spinlocks. This is a more polite version of #1. It seems likely to me that nearly every piece of code that uses dynamic shared memory will require locking. Instead of letting people allocate dynamic shared memory segments anyway and then having them start failing shortly after postmaster startup, we could just head the problem off at the pass by denying the request for dynamic shared memory in the first place. Dynamic shared memory allocation can always fail (e.g. because we're out of memory) and also has an explicit off switch that will make all requests fail (dynamic_shared_memory_type=none), so any code that uses dynamic shared memory has to be prepared for a failure at that point, whereas a failure in SpinLockInit() might be more surprising.

3. Provide an inverse for SpinLockInit, say SpinLockDestroy, and require all code written for dynamic shared memory to invoke this function on every spinlock before the shared memory segment is destroyed. I initially thought that this could be done using the on_dsm_detach infrastructure, but it turns out that doesn't really work. The on_dsm_detach infrastructure is designed to make sure that you *release* all of your locks when detaching - i.e. those hooks get invoked for each process that detaches. For this, you'd need an on_dsm_final_detach callback that gets called only for the very last detach (and after prohibiting any other processes from attaching). I can certainly engineer all that, but it's a decent amount of extra work for everyone who wants to use dynamic shared memory to write the appropriate callback, and because few people actually use --disable-spinlocks, I think those callbacks will tend to be rather lightly tested and thus a breeding ground for marginal bugs that nobody's terribly excited about fixing.

4. Drop support for --disable-spinlocks.

For what it's worth, my vote is currently for #2. I can't think of many interesting things to do with dynamic shared memory without having at least spinlocks, so I don't think we'd be losing much. #1 seems needlessly unfriendly, #3 seems like a lot of work for not much, and #4 seems excessive at least as a solution to this particular problem, though there may be other arguments for it. What do others think?

I think we're also going to want to be able to create LWLocks in dynamic shared memory: some critical sections won't be short enough or safe enough to be protected by spinlocks. At some level this seems easy: change LWLockAcquire and friends to accept an LWLock * rather than an LWLockId, and similarly change held_lwlocks[] to hold LWLock pointers rather than LWLockIds. One tricky point is that you'd better try not to detach a shared memory segment while you're holding lwlocks inside that segment, but I think just making that a coding rule shouldn't cause any great problem. Conversely, you'd better release all lwlocks in the segment before detaching it, but this seems mostly OK: throwing an error will call LWLockReleaseAll before doing the resource manager cleanups that will unmap the dynamic shared memory segment, so that's probably OK too. There may be corner cases I haven't thought about, though.

A bigger problem is that I think we want to avoid
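The semaphore-leak failure mode behind options #1 and #2 can be modeled in a few lines. The pool size and function names below are invented stand-ins, not PostgreSQL code; the point is only that the --disable-spinlocks emulation draws from a finite pool fixed at startup and never gives anything back:

```python
# Toy model: each SpinLockInit() under the semaphore emulation permanently
# consumes one semaphore from a pool sized at postmaster start.

MAX_SEMAPHORES = 4          # stand-in for the finite pool reserved at startup
_allocated = 0

def spin_lock_init():
    """Emulated spinlock init: grabs a semaphore that is never returned."""
    global _allocated
    if _allocated >= MAX_SEMAPHORES:
        raise RuntimeError("out of semaphores")
    _allocated += 1

# Each dynamically created DSM segment initializes a fresh spinlock...
created = 0
try:
    while True:
        spin_lock_init()
        created += 1
except RuntimeError:
    pass

# ...so only MAX_SEMAPHORES segments ever work before hard failure,
# persisting until the postmaster is restarted.
print(created)  # -> 4
```

Option #2 amounts to moving that RuntimeError up to the dsm allocation call, where callers already expect failure.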
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On Sun, Jan 5, 2014 at 12:34 PM, james ja...@mansionfamily.plus.com wrote:
> On 05/01/2014 16:50, Robert Haas wrote:
>> But on Windows, segments are *automatically* destroyed *by the operating system* when the last process unmaps them, so it's not quite so clear to me how we can allow it there. The main shared memory segment is no problem because the postmaster always has it mapped, even if no one else does, but that doesn't help for dynamic shared memory segments.
> Surely you just need to DuplicateHandle into the parent process? If you want to (tidily) dispose of it at some time, then you'll need to tell the postmaster that you have done so and what the handle is in its process, but if you just want it to stick around, then you can just pass it up.

Uh, I don't know, maybe? Does the postmaster have to do something to receive the duplicated handle, or can the child just throw it over the wall to the parent and let it rot until the postmaster finally exits? The latter would be nicer for our purposes, perhaps, as running more code from within the postmaster is risky for us. If a regular backend process dies, the postmaster will restart everything and the database will come back on line, but if the postmaster itself dies, we're hard down.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] dynamic shared memory and locks
On 2014-01-05 12:56:05 -0500, Robert Haas wrote:
> Right now, storing spinlocks in dynamic shared memory *almost* works, but there are problems with --disable-spinlocks. In that configuration, we use semaphores to simulate spinlocks. Every time someone calls SpinLockInit(), it's going to allocate a new semaphore which will never be returned to the operating system, so you're pretty quickly going to run out. There are a couple of things we could do about this:
>
> 4. Drop support for --disable-spinlocks.

I very strongly vote 4). I think we're going to hit this more and more often, and it's a facility that benefits almost nobody. Just about every new platform is or will be on gcc or clang, and you can just duplicate the compiler-provided generic implementation we have for ARM there. The atomics implementation makes this an automatic fallback if there's no compiler-specific variant around.

> I think we're also going to want to be able to create LWLocks in dynamic shared memory: some critical sections won't be short enough or safe enough to be protected by spinlocks.

Agreed.

> At some level this seems easy: change LWLockAcquire and friends to accept an LWLock * rather than an LWLockId, and similarly change held_lwlocks[] to hold LWLock pointers rather than LWLockIds.

My primary reason isn't dsm TBH, but wanting to embed the buffer lwlocks in the bufferdesc, on the same cacheline as the buffer header's spinlock. All the embedded ones can be allocated without padding, while the relatively low number of non-embedded ones can be padded to the full cacheline size.

> A bigger problem is that I think we want to avoid having a large amount of notational churn. The obvious way to do that is to get rid of the LWLockId array and instead declare each fixed LWLock separately as e.g. LWLock *ProcArrayLock. However, creating a large number of new globals that will need to be initialized in every new EXEC_BACKEND process seems irritating. So maybe a better idea is to do something like this:
>
> #define BufFreelistLock (fixedlwlocks[0])
> #define ShmemIndexLock (fixedlwlocks[1])
> ...
> #define AutoFileLock (fixedlwlocks[36])
> #define NUM_FIXED_LWLOCKS 37
>
> Comments, suggestions?

My idea here was to just have two APIs: a legacy one that works like the current one, and a new one that locks the lwlocks passed in by a pointer. After all, LWLockAssign(), which you use in extensions currently, returns an LWLockId; it seems ugly to turn that into a pointer. But perhaps your idea is better anyway, no matter the hackery of turning LWLockId into a pointer.

Greetings,

Andres Freund
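The two-API idea can be sketched abstractly. Python threading locks stand in for LWLocks and every name below is invented; this only contrasts the legacy index-into-a-fixed-array style with a pointer-style API that lets a lock live inside the structure it protects (Andres's buffer descriptor case):

```python
# Legacy style: locks addressed by index into one fixed array.
# New style: pass the lock object itself, so it can be embedded anywhere,
# e.g. next to the data it guards (think: on the same cacheline).
import threading

fixedlwlocks = [threading.Lock() for _ in range(37)]
BUF_FREELIST_LOCK = 0   # like "#define BufFreelistLock (fixedlwlocks[0])"

def lwlock_acquire_by_id(lock_id):      # legacy interface
    fixedlwlocks[lock_id].acquire()

def lwlock_acquire(lock):               # new interface: takes the lock itself
    lock.acquire()

class BufferDesc:
    """The lock lives inside the structure it protects."""
    def __init__(self):
        self.content_lock = threading.Lock()

lwlock_acquire_by_id(BUF_FREELIST_LOCK)
fixedlwlocks[BUF_FREELIST_LOCK].release()

buf = BufferDesc()
lwlock_acquire(buf.content_lock)        # no global array involved
buf.content_lock.release()
```

Both interfaces can coexist: the legacy one is just the new one applied to members of the fixed array.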
Re: [HACKERS] [PATCH] Support for pg_stat_archiver view
On 05/01/14 13:52, Magnus Hagander wrote:
> Yes, that's what I'm talking about. And just increment it by 1.

Done. I am attaching version 2 of the patch, which now implements only one function (pg_stat_get_archiver()) and adds:

* failed attempts
* WAL of the last failed attempt
* time of the last failed attempt

Thanks for your inputs.

Ciao,
Gabriele

--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it | www.2ndQuadrant.it

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 4ec6981..0094c19 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -270,6 +270,14 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
      </row>

      <row>
+      <entry><structname>pg_stat_archiver</><indexterm><primary>pg_stat_archiver</primary></indexterm></entry>
+      <entry>One row only, showing statistics about the
+       WAL archiver process's activity. See
+       <xref linkend="pg-stat-archiver-view"> for details.
+      </entry>
+     </row>
+
+     <row>
       <entry><structname>pg_stat_bgwriter</><indexterm><primary>pg_stat_bgwriter</primary></indexterm></entry>
       <entry>One row only, showing statistics about the
        background writer process's activity. See
@@ -648,6 +656,64 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
    </para>
   </note>

+  <table id="pg-stat-archiver-view" xreflabel="pg_stat_archiver">
+   <title><structname>pg_stat_archiver</structname> View</title>
+
+   <tgroup cols="3">
+    <thead>
+     <row>
+      <entry>Column</entry>
+      <entry>Type</entry>
+      <entry>Description</entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry><structfield>archived_wals</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL files that have been successfully archived</entry>
+     </row>
+     <row>
+      <entry><structfield>last_archived_wal</></entry>
+      <entry><type>text</type></entry>
+      <entry>Name of the last successfully archived WAL file</entry>
+     </row>
+     <row>
+      <entry><structfield>last_archived_wal_time</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time of the last successful archival operation</entry>
+     </row>
+     <row>
+      <entry><structfield>failed_attempts</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of failed attempts for archiving WAL files</entry>
+     </row>
+     <row>
+      <entry><structfield>last_failed_wal</></entry>
+      <entry><type>text</type></entry>
+      <entry>Name of the WAL file of the last failed archival operation</entry>
+     </row>
+     <row>
+      <entry><structfield>last_failed_wal_time</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time of the last failed archival operation</entry>
+     </row>
+     <row>
+      <entry><structfield>stats_reset</></entry>
+      <entry><type>timestamp with time zone</type></entry>
+      <entry>Time at which these statistics were last reset</entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+  <para>
+   The <structname>pg_stat_archiver</structname> view will always have a
+   single row, containing data about the archiver process of the cluster.
+  </para>
+
   <table id="pg-stat-bgwriter-view" xreflabel="pg_stat_bgwriter">
    <title><structname>pg_stat_bgwriter</structname> View</title>
@@ -1613,6 +1679,8 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
        argument (requires superuser privileges).
        Calling <literal>pg_stat_reset_shared('bgwriter')</> will zero all the
        counters shown in the <structname>pg_stat_bgwriter</> view.
+       Calling <literal>pg_stat_reset_shared('archiver')</> will zero all the
+       counters shown in the <structname>pg_stat_archiver</> view.
      </entry>
     </row>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 575a40f..5ea8c87 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -672,6 +672,17 @@ CREATE VIEW pg_stat_xact_user_functions AS
     WHERE P.prolang != 12 -- fast check to eliminate built-in functions
           AND pg_stat_get_xact_function_calls(P.oid) IS NOT NULL;

+CREATE VIEW pg_stat_archiver AS
+    SELECT
+        s.archived_wals,
+        s.last_archived_wal,
+        s.last_archived_wal_time,
+        s.failed_attempts,
+        s.last_failed_wal,
+        s.last_failed_wal_time,
+        s.stats_reset
+    FROM pg_stat_get_archiver() s;
+
 CREATE VIEW pg_stat_bgwriter AS
     SELECT
         pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 2bb572e..60f957c 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -36,6 +36,7 @@
 #include "access/xlog_internal.h"
 #include "libpq/pqsignal.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "postmaster/fork_process.h"
 #include "postmaster/pgarch.h"
 #include "postmaster/postmaster.h"
@@ -46,6 +47,7 @@
 #include "storage/pmsignal.h"
 #include "utils/guc.h"
 #include "utils/ps_status.h"
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
From my point of view it is not a big problem that it is not possible to place LWLocks in DSM. I can allocate LWLocks in the standard way - using RequestAddinLWLocks - and use them for synchronization.

Concerning support of huge pages - actually I do not think that it should involve anything more than just setting the MAP_HUGETLB flag. Allocation of the corresponding number of huge pages should be done by the system administrator.

And what I still do not completely understand - how does DSM enforce that a segment created by one PostgreSQL process will be mapped to the same virtual memory address in all other PostgreSQL processes? As far as I understand, right now (with standard PostgreSQL shared memory segments) it is enforced by fork(): shared memory segments are allocated in one process and all other processes are forked from this process, inheriting these memory segments. But if a new DSM segment is allocated during execution of some query, then we should add it to the virtual address space of all PostgreSQL processes. Even if we somehow notify them all about the presence of the new segment, there is absolutely no guarantee that all of them can map this segment at the specified memory address (it can be, for some reason, already used by some other shared object).

Or maybe DSM doesn't guarantee that a DSM segment is mapped to the same address in all processes? In this case it significantly complicates DSM usage: it will not be possible to use direct pointers. Can you clarify for me please how dynamically allocated DSM segments will be shared by all PostgreSQL processes?

On 01/05/2014 08:50 PM, Robert Haas wrote:
> [...]
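The mapping-address concern raised above is exactly why code in such segments typically stores offsets from the segment base rather than raw pointers: an offset stays valid no matter what virtual address each process happens to map the segment at. A hedged sketch, with a bytearray standing in for a mapped segment and all names invented:

```python
# Offset-based addressing inside a shared segment: a linked list whose
# "next" fields are offsets from the segment base, not pointers, so it can
# be traversed from any mapping address.
import struct

SEGMENT_SIZE = 64

def write_node(seg, offset, value, next_offset):
    # node layout: 4-byte value, 4-byte offset of the next node (0 = end)
    struct.pack_into("!ii", seg, offset, value, next_offset)

def read_chain(seg, base_offset):
    """Follow 'next' offsets; works regardless of mapping address."""
    values, off = [], base_offset
    while off != 0:
        value, off = struct.unpack_from("!ii", seg, off)
        values.append(value)
    return values

# "Process A" builds a two-node list inside its mapping of the segment.
segment_in_a = bytearray(SEGMENT_SIZE)
write_node(segment_in_a, 8, 100, 16)
write_node(segment_in_a, 16, 200, 0)

# "Process B" sees the same bytes at a different virtual address; the
# offsets still resolve. (A copy stands in for a second mapping here.)
segment_in_b = bytearray(segment_in_a)
print(read_chain(segment_in_b, 8))  # -> [100, 200]
```

The cost, as the message says, is that direct pointers into the segment can't be handed between processes; everything becomes base + offset.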
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On 05/01/2014 18:02, Robert Haas wrote:
> On Sun, Jan 5, 2014 at 12:34 PM, james ja...@mansionfamily.plus.com wrote:
>> Surely you just need to DuplicateHandle into the parent process? If you want to (tidily) dispose of it at some time, then you'll need to tell the postmaster that you have done so and what the handle is in its process, but if you just want it to stick around, then you can just pass it up.
> Uh, I don't know, maybe? Does the postmaster have to do something to receive the duplicated handle

In principle, no, so long as the child has a handle to the parent process that has the appropriate permissions. Given that these processes have a parent/child relationship, that shouldn't be too hard to arrange.

> , or can the child just throw it over the wall to the parent and let it rot until the postmaster finally exits?

Yes. Though it might be a good idea to record the handle somewhere (perhaps in a table) so that any potential issues from an insane system spamming the postmaster with handles are apparent.

I'm intrigued - how are the handles shared between children that are peers in the current scheme? Some handle transfer must already be in place. Could you share the handles with an immortal worker if you want to reduce any potential impact on the postmaster?

> The latter would be nicer for our purposes, perhaps, as running more code from within the postmaster is risky for us. If a regular backend process dies, the postmaster will restart everything and the database will come back on line, but if the postmaster itself dies, we're hard down.
Re: [HACKERS] dynamic shared memory and locks
Robert Haas robertmh...@gmail.com writes: For what it's worth, my vote is currently for #2. I can't think of many interesting things to do with dynamic shared memory without having at least spinlocks, so I don't think we'd be losing much. #1 seems needlessly unfriendly, #3 seems like a lot of work for not much, and #4 seems excessive at least as a solution to this particular problem, though there may be other arguments for it. What do others think? I agree with this position. There may be some good reason to drop --disable-spinlocks altogether in future, but DSM isn't a sufficient excuse. I think we're also going to want to be able to create LWLocks in dynamic shared memory: some critical sections won't be short enough or safe enough to be protected by spinlocks. At some level this seems easy: change LWLockAcquire and friends to accept an LWLock * rather than an LWLockId, and similarly change held_lwlocks[] to hold LWLock pointers rather than LWLockIds. I seem to recall that there was some good reason for keeping all the LWLocks in an array, back when the facility was first designed. I'm too lazy to research the point right now, but you might want to go back and look at the archives around when lwlock.c was written. creating a large number of new globals that will need to be initialized in every new EXEC_BACKEND process seems irritating. This might've been the good reason, but not sure --- I think LWLocks predate our Windows support. In the end, though, that decision was taken before we were concerned about being able to add new LWLocks on the fly, which is what this is really about (whether they're stored in DSM or not is a secondary point). The pressure for that has gotten strong enough that it may be time to change the tradeoff. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Support for pg_stat_archiver view
Sent via iPhone On 05/01/2014, at 16:27, Gabriele Bartolini gabriele.bartol...@2ndquadrant.it wrote: On 05/01/14 13:52, Magnus Hagander wrote: Yes, that's what I'm talking about. And just increment it by 1. Done. I am attaching version 2 of the patch, which now implements only one function (pg_stat_get_archiver()) and adds: * failed attempts * WAL of the last failed attempt * time of the last failed attempt Hi, I don't see your code yet, but I would like to know if it is possible to implement this view as an extension. Regards, Fabrízio Mello
Re: [HACKERS] dynamic shared memory and locks
On 2014-01-05 14:06:52 -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: For what it's worth, my vote is currently for #2. I can't think of many interesting things to do with dynamic shared memory without having at least spinlocks, so I don't think we'd be losing much. #1 seems needlessly unfriendly, #3 seems like a lot of work for not much, and #4 seems excessive at least as a solution to this particular problem, though there may be other arguments for it. What do others think? I agree with this position. There may be some good reason to drop --disable-spinlocks altogether in future, but DSM isn't a sufficient excuse. Agreed that DSM isn't sufficient cause. The reasons I see for removing it in the future are: * It's not tested at all and it has been partially broken for significant stretches of time. Afair Heikki just noticed the breakage accidentally when adding enough new spinlocks recently. * It has shown up as problematic in a couple of patches while adding not much value (at least dsm, atomic ops, afair some others) * It's slow enough that it's of no practical value. * Implementing simple support for spinlocks on realistic platforms isn't hard these days due to compiler intrinsics. * The platforms that don't have a barrier implementation will rely on spinlocks. And for correctness those spinlocks should employ barriers. That might be more of an argument for getting rid of that fallback tho. I think we're also going to want to be able to create LWLocks in dynamic shared memory: some critical sections won't be short enough or safe enough to be protected by spinlocks. At some level this seems easy: change LWLockAcquire and friends to accept an LWLock * rather than an LWLockId, and similarly change held_lwlocks[] to hold LWLock pointers rather than LWLockIds. I seem to recall that there was some good reason for keeping all the LWLocks in an array, back when the facility was first designed.
I'm too lazy to research the point right now, but you might want to go back and look at the archives around when lwlock.c was written. Your proposal is at http://www.postgresql.org/message-id/1054.1001520...@sss.pgh.pa.us - afaics there hasn't been much discussion about implementation details at that level. In the end, though, that decision was taken before we were concerned about being able to add new LWLocks on the fly, which is what this is really about (whether they're stored in DSM or not is a secondary point). The pressure for that has gotten strong enough that it may be time to change the tradeoff. I personally find the sharing of a cacheline between a buffer header's spinlock and the lwlock much more interesting than DSM... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] Store Extension Options
On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: I would suggest addressing Robert's concern about lack of error checking by refusing to allow a custom reloption to be set unless the relevant extension is loaded and checks it. Unlike the postgresql.conf problem, I don't see any very good use-case for allowing an unchecked ALTER TABLE to occur. How do you plan to resolve the associated dump/restore hazard? AIUI, that's why we allow people to define any old this.that GUC that they want without checking it - because the relevant shared library might not be loaded at the time of definition, but only by time of use. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. Currently I am cannibalizing src/test/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever. The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs them each in nonprivileged mode and bound to different ports, feeds all the existing 141 regression tests into the master database with the usual checking that all the right results are obtained, and then checks that the standbys have the expected data. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster. There of course is room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such. The main drawbacks that I don't care for are: 1) 'make check' becomes 'sudo make check' because it needs permission to run chroot. 2) I have no win32 version of the logic 3) Bind mounts either have to be created by the privileged pg_regress process or have to be pre-existing on the system #1 would not be as bad if pg_regress became pg_regress_replication, as we could make the mantra into 'sudo make replicationcheck' or similar. Splitting it from 'make check' also means IMHO that it could have heavier tests that take longer to run, since people merely interested in building and installing postgres would not be impacted by this. #2 might be fixed by someone more familiar with win32 programming than I am.
#3 cannot be avoided as far as I can tell, but we could choose between the two options. So far, I have chosen to set up the directory structure and add the bind mount logic to my /etc/fstab only once, rather than having this get recreated every time I invoke 'sudo make check'. The community might prefer to go the other way, and have the directories and bind mounts get set up each invocation; I have avoided that thus far as I don't want 'sudo make check' (or 'sudo make replicationcheck') to abuse its raised privileges and muck with the filesystem in a way that could cause the user unexpected problems. The main advantages that I like about this design are: 1) Only one system is required. The developer does not need network access to a second replication system. Moreover, multiple database clusters can be established with interesting replication hierarchies between them, and the cost of each additional cluster is just another chroot environment 2) Checking out the sources from git and then running './configure', 'make', and 'sudo make replicationtest' is not particularly difficult, assuming the directories and mounts are in place, or alternatively assuming that 'sudo make regressioncheck' creates them for you if they don't already exist. Comments and advice sincerely solicited, mark
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/13, 2:22 PM, Andres Freund wrote: It would only force serialization for transactions that modify tables covered by the assert, that doesn't seem too bad. Anything covered by an assert shouldn't be modified frequently, otherwise you'll run into major performance problems. I think that makes sense. If you want to use assertions, you need to run in serializable mode, otherwise you get an error if you modify anything covered by an assertion. In the future, someone could enhance this for other isolation levels, but as Josh has pointed out, that would likely just be reimplementing SSI with big locks.
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/13, 2:45 PM, Andres Freund wrote: Well, the patch's syntax seems to allow only specifying a SQL query to check directly - we could iterate over the querytree to gather all related tables and reject any function we do not understand. Creating an assertion scans the query tree and attaches constraint triggers to all tables covered by the assertion condition. So it does essentially work the way you say.
Re: [HACKERS] [PATCH] SQL assertions prototype
On 12/18/13, 4:04 PM, Andrew Dunstan wrote: Umm, that's really a major limitation in utility. We need to come up with a better answer than this, which would essentially hobble the facility. We don't have any facility to run a trigger on just any command, it needs to be triggered by a table. Of course, that's where your on-commit event triggers come in, presumably. But for example, constraint triggers support deferrability, which an on-commit trigger wouldn't. We'd just need DML triggers on any/no tables.
Re: [HACKERS] more psprintf() use
On 1/2/14, 9:28 AM, Tom Lane wrote: Heikki is right and you are wrong. There is an ancient supposition that datatype output functions, in particular, always return palloc'd strings. I recently got rid of the pfree's in the main output path, cf commit b006f4ddb988568081f8290fac77f9402b137120, which might explain why this patch passes regression tests; but there are still places in the code (and even more likely in third-party code) that will try to pfree the results. Well, that seems kind of dangerous. The next guy is going to write an extension that returns string constants directly, and there is no straightforward way to detect this problem. Perhaps we should have some mode similar to the CLOBBER and COPY_*_TREES symbols to force a pfree() in assertion-enabled builds?
Re: [HACKERS] more psprintf() use
On 1/2/14, 2:12 PM, Alvaro Herrera wrote: Peter Eisentraut wrote: psprintf() in place of hardcoded palloc(N) + sprintf() and the like. +values[j++] = psprintf("%d", stat.blkno); +values[j++] = psprintf("%c", stat.type); +values[j++] = psprintf("%d", stat.live_items); +values[j++] = psprintf("%d", stat.dead_items); +values[j++] = psprintf("%d", stat.avg_item_size); +values[j++] = psprintf("%d", stat.page_size); +values[j++] = psprintf("%d", stat.free_size); +values[j++] = psprintf("%d", stat.btpo_prev); +values[j++] = psprintf("%d", stat.btpo_next); +values[j++] = psprintf("%d", (stat.type == 'd') ? stat.btpo.xact : stat.btpo.level); +values[j++] = psprintf("%d", stat.btpo_flags); tuple = BuildTupleFromCStrings(TupleDescGetAttInMetadata(tupleDesc), values); In cases such as this one, I have often wondered whether it'd be better to write this as DatumGetSometype() plus heap_form_tuple, instead of printing to strings and then building a tuple from those. Probably. As you can see, this style is only used in a few contrib modules that all came from the same source, I think.
Re: [HACKERS] GiST support for inet datatypes
2013-12-17 Emre Hasegeli e...@hasegeli.com: The query planner never chooses to use the index for the operators for which the index is particularly useful, because selectivity estimation functions are missing. I am planning to work on them. The attached patch adds selectivity estimation functions for the overlap and adjacent operators. Other operators need a bit more work. I wanted to send it early to get some feedback. inet-selfuncs-v1.patch Description: Binary data
Re: [HACKERS] more psprintf() use
Peter Eisentraut pete...@gmx.net writes: On 1/2/14, 9:28 AM, Tom Lane wrote: Heikki is right and you are wrong. There is an ancient supposition that datatype output functions, in particular, always return palloc'd strings. I recently got rid of the pfree's in the main output path, cf commit b006f4ddb988568081f8290fac77f9402b137120, which might explain why this patch passes regression tests; but there are still places in the code (and even more likely in third-party code) that will try to pfree the results. Well, that seems kind of dangerous. The next guy is going to write an extension that returns string constants directly, and there is no straightforward way to detect this problem. Perhaps we should have some mode similar to the CLOBBER and COPY_*_TREES symbols to force a pfree() in assertion-enabled builds? Seems kinda backwards. If we want to put any effort into this issue, it'd be better to head in the direction of making the world safe for output functions to return constants, ie deprecate rather than enforce the practice of pfree'ing their results. But see http://www.postgresql.org/message-id/12646.1383420...@sss.pgh.pa.us regards, tom lane
Re: [HACKERS] [PATCH] Store Extension Options
Robert Haas robertmh...@gmail.com writes: On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: I would suggest addressing Robert's concern about lack of error checking by refusing to allow a custom reloption to be set unless the relevant extension is loaded and checks it. Unlike the postgresql.conf problem, I don't see any very good use-case for allowing an unchecked ALTER TABLE to occur. How do you plan to resolve the associated dump/restore hazard? pg_dump creates extensions before tables, no? So what dump/restore hazard? AIUI, that's why we allow people to define any old this.that GUC that they want without checking it - because the relevant shared library might not be loaded at the time of definition, but only by time of use. No, the reason we allow GUCs to be set before the relevant library is loaded is so that you can put a setting into postgresql.conf without thereby having to make the extension be load-into-postmaster. regards, tom lane
Re: [HACKERS] [PATCH] SQL assertions prototype
Peter Eisentraut pete...@gmx.net wrote: On 12/18/13, 2:22 PM, Andres Freund wrote: It would only force serialization for transactions that modify tables covered by the assert, that doesn't seem too bad. Anything covered by an assert shouldn't be modified frequently, otherwise you'll run into major performance problems. I think that makes sense. If you want to use assertions, you need to run in serializable mode, otherwise you get an error if you modify anything covered by an assertion. In the future, someone could enhance this for other isolation levels, but as Josh has pointed out, that would likely just be reimplementing SSI with big locks. SSI only actually works correctly if all transactions use SSI... I am not sure we can guarantee that the subset we'd require would be safe without the read side using SSI. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
-- greg On 5 Jan 2014 14:54, Mark Dilger markdil...@yahoo.com wrote: I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. This sounds pretty cool. The real trick will be in testing concurrent behaviour -- i.e. queries on the slave when it's replaying logs at a certain point. But right now we have nothing so anything would be an improvement. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster. This isn't necessary. You can use the same binaries and run initdb with a different location just fine. Then start up the database with -D to specify the directory.
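Greg's alternative can be sketched as a few commands (hypothetical paths and ports; a real standby would be cloned with pg_basebackup and configured for recovery rather than created with a second initdb):

```shell
# Two independent clusters on one host: separate data directories and
# ports, same binaries -- no chroot or bind mounts needed.
initdb -D /tmp/master
initdb -D /tmp/standby      # illustrative only; clone with pg_basebackup instead
pg_ctl -D /tmp/master  -o "-p 5440" -l /tmp/master.log  start
pg_ctl -D /tmp/standby -o "-p 5441" -l /tmp/standby.log start
psql -p 5440 -c "SELECT 1"  # talk to the master
```

Since each server is fully addressed by its data directory and port, nothing about the filesystem needs to be virtualized, which is what removes the sudo requirement from the test suite.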
Re: [HACKERS] In-core regression tests for replication, cascading, archiving, PITR, etc. Michael Paquier
On Mon, Jan 6, 2014 at 4:51 AM, Mark Dilger markdil...@yahoo.com wrote: I am building a regression test system for replication and came across this email thread. I have gotten pretty far into my implementation, but would be happy to make modifications if folks have improvements to suggest. If the community likes my design, or a modified version based on your feedback, I'd be happy to submit a patch. Yeah, this would be nice to look at, core code definitely needs to have some more infrastructure for such a test suite. I didn't get the time to go back to it since I began this thread though :) Currently I am cannibalizing src/test/pg_regress.c, but that could instead be copied to src/test/pg_regress_replication.c or whatever. The regression test creates and configures multiple database clusters, sets up the replication configuration for them, runs them each in nonprivileged mode and bound to different ports, feeds all the existing 141 regression tests into the master database with the usual checking that all the right results are obtained, and then checks that the standbys have the expected data. This is possible all on one system because the database clusters are chroot'ed to see their own /data directory and not the /data directory of the other chroot'ed clusters, although the rest of the system, like /bin and /etc and /dev are all bind mounted and visible to each cluster. Having vanilla regressions run in a cluster with multiple nodes and check the results on a standby is the tip of the iceberg though. What I had in mind when I began this thread was to have more than a copy/paste of pg_regress, but an infrastructure that people could use to create and customize tests by having an additional control layer on the cluster itself. For example, testing replication is not only a matter of creating and setting up the nodes, but you might want to be able to initialize, add, remove nodes during the tests.
Node addition would be either a fresh new master (this would be damn useful for a test suite for logical replication I think), or a slave node with custom recovery parameters to test replication, as well as PITR, archiving, etc. Then you need to be able to run SQL commands on top of that to check if the results are consistent with what you want. A possible input for a test that users could provide would be something like this:

# Node information for tests
nodes {
  {node1, postgresql.conf params, recovery.conf params}
  {node2, postgresql.conf params, recovery.conf params, slave of node1}
}
# Run test
init node1
run_sql node1 file1.sql # Check output
init node2
run_sql node2 file2.sql # Check that results are fine
# Process

The main problem is actually how to do that. Having some smart shell infrastructure would be simple and would facilitate (?) the maintenance of code used to run the tests. On the contrary, having a C program would make the maintenance of code to run the tests more difficult (?) for a trade with more readable test suite input like the one I wrote above. This might also make the test input more readable for a human eye, in the shape of what is already available in src/test/isolation. Another possibility could also be to integrate a recovery/backup manager directly in PG core, and have some tests for it, or even include those tests directly with pg_basebackup or an upper layer of it. There of course is room to add as many replication tests as you like, and the main 141 tests fed into the master could be extended to feed more data and such. The main drawbacks that I don't care for are: 1) 'make check' becomes 'sudo make check' because it needs permission to run chroot. -1 for that; developers should not need to use root to run the regression suite. 2) I have no win32 version of the logic For a first shot I am not sure that it matters much. The main advantages that I like about this design are: 1) Only one system is required.
The developer does not need network access to a second replication system. Moreover, multiple database clusters can be established with interesting replication hierarchies between them, and the cost of each additional cluster is just another chroot environment. An assumption of the test suite, I think, is to allow developers to check for bugs on a local server only. This facilitates how the test suite is written and you don't need to get into things like VM settings or cross-environment tests, things that could already be done nicely by frameworks like Jenkins. What I think people would like to have is this:

cd src/test/replication
make check/installcheck

And have the tests run for them. Regards, -- Michael
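The readability trade-off Michael raises (shell vs. C driver vs. declarative input) can be made concrete. Here is a hypothetical Python sketch that only parses the kind of test script shown above into executable steps; actually starting nodes and running SQL would hook into initdb/pg_ctl/psql and is not shown:

```python
def parse_script(text):
    """Parse lines like 'init node1' or 'run_sql node1 file1.sql'
    into (command, node, args) tuples, skipping comments and blanks."""
    steps = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # strip trailing comments
        if not line:
            continue
        parts = line.split()
        steps.append((parts[0], parts[1], parts[2:]))
    return steps

script = """
# Run test
init node1
run_sql node1 file1.sql  # Check output
init node2
run_sql node2 file2.sql
"""
steps = parse_script(script)
assert steps[0] == ("init", "node1", [])
assert steps[1] == ("run_sql", "node1", ["file1.sql"])
```

The point of the sketch is that the parsing layer is trivial in any language; the maintenance cost lives almost entirely in the driver layer that creates clusters and checks results, which is where the shell-vs-C question actually bites.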
Re: [HACKERS] dynamic shared memory and locks
On Sun, Jan 5, 2014 at 2:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: I seem to recall that there was some good reason for keeping all the LWLocks in an array, back when the facility was first designed. I'm too lazy to research the point right now, but you might want to go back and look at the archives around when lwlock.c was written. To some extent it's an orthogonal question. It's true that there isn't much point in using LWLock * rather than LWLockId to refer to LWLocks unless we wish to be able to store them outside the shared memory segment, but the reverse is not true: just because we have the ability to move things outside the main array in the general case doesn't make it a good idea in any particular case. Andres's email seems to indicate that he sees performance advantages in moving buffer locks elsewhere, and I have a sneaking suspicion that we'll find that it's more convenient to move some other things around as well, but that's policy, not mechanism. Very little of the core LWLock machinery actually cares how the locks are stored, so the attached patch to make it use LWLock * rather than LWLockId as a handle is pretty straightforward. The only real problem I see here is that we occasionally *print out* LWLockIds as a way of identifying lwlocks. This is not a great system, but printing out pointers is almost certainly much worse (e.g. because of ASLR). The cases where this is currently an issue are: - You try to release a lwlock you haven't acquired. We elog(ERROR) the ID. - You define LWLOCK_STATS. The resulting reports print the lock ID. - You define LOCK_DEBUG and set trace_lwlocks=true. We print the lock ID in the trace messages. - You compile with --enable-dtrace. We pass the lock IDs to the dtrace probes. In the attached patch I handled the first case by printing the pointer (which I don't feel too bad about) and the remaining three cases by leaving them broken.
I wouldn't shed a tear about ripping out trace_lwlocks, but LWLOCK_STATS is useful and I bet somebody is using --enable-dtrace, so we probably need to fix those cases at least. I suppose one option is to make LWLOCK_STATS and the dtrace probes only look at locks in the main array and just ignore everything else. But that's kind of crappy, especially if we're going to soon move buffer locks out of the main array. Another idea is to include some identifying information in the lwlock. For example, each lwlock could have a char *name in it, and we could print the name. In theory, this could be a big step forward in terms of usability, because it'd spare us all needing to remember that 4 == ProcArrayLock. But it's awkward for buffer locks, of which there might be a great many, and we surely don't want to allocate a dynamically-generated string in shared memory for each one. You could do a bit better by making the identifying information a string plus an integer, because then all the buffer locks could set the string to a static constant like buffer content lock and the integer to the buffer number, and similarly for lock manager partition locks and so on. This is appealing, but would increase the size of LWLockPadded from 16 bytes to 32 on 64-bit platforms where slock_t is four bytes or less, which I'm not that excited about even though it would reduce cache line contention in some cases. Yet a third idea is to try to reverse-engineer a name for a given lwlock from the pointer address. If it's an offset into the main array, this is easy enough to do, and even if we ended up with several arrays (like one for bufmgr locks) it wouldn't be too hard to write code to figure out which array contains it and emit the appropriate string. The only real problem that I see with this is that it might cause a performance hit. A performance hit when running with trace_lwlocks or LWLOCK_STATS is not really a problem, but people won't like if --enable-dtrace slow things up. 
Preferences, other ideas? None of these ideas are a complete solution for LWLOCK_STATS. In the other three cases noted above, we only need an identifier for the lock instantaneously, so that we can pass it off to the logger or dtrace or whatever. But LWLOCK_STATS wants to hold on to data about the locks that were visited until the end of the session, and it does that using an array that is *indexed* by lwlockid. I guess we could replace that with a hash table. Ugh. Any suggestions? (Incidentally, while developing this patch I found a bug in the current code: lock.c iterates over all PGPROCs from 0 to ProcGlobal->allProcCount and takes the backendLock for each, but if max_prepared_transactions > 0 then the PGPROCs for prepared transactions do not have a backendLock and so we take and release BufFreelistLock - i.e. 0 - a number of times equal to max_prepared_transactions. That's not cool.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/src/backend/access/transam/slru.c
Re: [HACKERS] [PATCH] Store Extension Options
On Sun, Jan 5, 2014 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: I would suggest addressing Robert's concern about lack of error checking by refusing to allow a custom reloption to be set unless the relevant extension is loaded and checks it. Unlike the postgresql.conf problem, I don't see any very good use-case for allowing an unchecked ALTER TABLE to occur. How do you plan to resolve the associated dump/restore hazard? pg_dump creates extensions before tables, no? So what dump/restore hazard? Creating the extension doesn't guarantee that the shared library will always be loaded. If nothing else, think about partial restores. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On Sun, Jan 5, 2014 at 1:28 PM, knizhnik knizh...@garret.ru wrote: From my point of view it is not a big problem that it is not possible to place LWLock in DSM. I can allocate LWLocks in the standard way - using RequestAddinLWLocks - and use them for synchronization. Sure, well, that works fine if you're being loaded from shared_preload_libraries. If you want to be able to load the extension after startup time, though, it's no good. And what I still do not completely understand - how does DSM enforce that a segment created by one PostgreSQL process will be mapped to the same virtual memory address in all other PostgreSQL processes? It doesn't. One process calls dsm_create() to create a shared memory segment. Other processes call dsm_attach() to attach it. There's no guarantee that they'll map it at the same address; they'll just map it somewhere. Or maybe DSM doesn't guarantee that the DSM segment is mapped to the same address in all processes? In this case it significantly complicates DSM usage: it will not be possible to use direct pointers. Yeah, that's where we're at. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On Sun, Jan 5, 2014 at 1:44 PM, james ja...@mansionfamily.plus.com wrote: I'm intrigued - how are the handles shared between children that are peers in the current scheme? Some handle transfer must already be in place. That's up to the application. After calling dsm_create(), you call dsm_segment_handle() to get the 32-bit integer handle for that segment. Then you have to get that to the other process(es) somehow. If you're trying to share a handle with a background worker, you can stuff it in bgw_main_arg. Otherwise, you'll probably need to store it in the main shared memory segment, or a file, or whatever. Could you share the handles to an immortal worker if you want to reduce any potential impact on the postmaster? You could, but that seems like thin justification for spawning another process, and how immortal is that worker really? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
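The create-then-pass-a-small-handle pattern Robert describes has a close analogue in Python's stdlib, where the segment name plays the role of the dsm_segment_handle() value (illustrative only - this is not PostgreSQL code):

```python
# Analogue of the dsm_create() / dsm_segment_handle() / dsm_attach() pattern:
# the segment *name* is the small handle you transfer to the other process
# (there via bgw_main_arg, the main shm segment, or a file; here via a variable).
from multiprocessing import shared_memory

creator = shared_memory.SharedMemory(create=True, size=64)
creator.buf[:5] = b"hello"
handle = creator.name  # small and cheap to transfer; not a pointer

# "Other process": attach by handle, not by address.
attached = shared_memory.SharedMemory(name=handle)
payload = bytes(attached.buf[:5])
print(payload)  # b'hello'

attached.close()
creator.close()
creator.unlink()  # POSIX-style: segment persists until explicitly removed
```

The explicit unlink() step mirrors the lifetime question in the thread: on POSIX the segment outlives its creator until removed, whereas Windows tears it down when the last handle goes away.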
Re: [HACKERS] [PATCH] Store Extension Options
On Mon, Jan 6, 2014 at 1:08 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jan 5, 2014 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jan 4, 2014 at 1:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: I would suggest addressing Robert's concern about lack of error checking by refusing to allow a custom reloption to be set unless the relevant extension is loaded and checks it. Unlike the postgresql.conf problem, I don't see any very good use-case for allowing an unchecked ALTER TABLE to occur. How do you plan to resolve the associated dump/restore hazard? pg_dump creates extensions before tables, no? So what dump/restore hazard? Creating the extension doesn't guarantee that the shared library will always be loaded. If nothing else, think about partial restores. You are correct. pg_dump exports reloptions using the WITH clause of the CREATE TABLE statement, i.e.: CREATE TABLE foo ( ) WITH (autovacuum_enabled=false, bdr.do_replicate=false); So if this statement checks that the 'bdr' extension is loaded, then a partial restore can fail. At this point we have two choices: 1) do not check whether the extension is loaded; 2) change pg_dump to produce an ALTER TABLE ... SET (...) instead of CREATE TABLE ... WITH (...) to set the reloptions. Comments? -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog sobre TI: http://fabriziomello.blogspot.com Perfil Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello
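Option (2) could look roughly like the following hypothetical sketch. The function name, output format, and the dot-in-name test for "extension-owned" options are all invented for illustration; this is not pg_dump's actual code:

```python
# Sketch of option (2): emit extension-prefixed reloptions as a separate
# ALTER TABLE ... SET (...) rather than folding them into CREATE TABLE's
# WITH clause, so a partial restore without the extension fails only the
# ALTER, not the table creation itself.

def emit_table_sql(table, reloptions):
    builtin = {k: v for k, v in reloptions.items() if "." not in k}
    prefixed = {k: v for k, v in reloptions.items() if "." in k}

    def fmt(opts):
        return ", ".join(f"{k}={v}" for k, v in sorted(opts.items()))

    stmts = []
    if builtin:
        stmts.append(f"CREATE TABLE {table} () WITH ({fmt(builtin)});")
    else:
        stmts.append(f"CREATE TABLE {table} ();")
    if prefixed:
        stmts.append(f"ALTER TABLE {table} SET ({fmt(prefixed)});")
    return stmts

for s in emit_table_sql("foo", {"autovacuum_enabled": "false",
                                "bdr.do_replicate": "false"}):
    print(s)
# CREATE TABLE foo () WITH (autovacuum_enabled=false);
# ALTER TABLE foo SET (bdr.do_replicate=false);
```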
[HACKERS] Compiling extensions on Windows
Hi all Out of personal interest (in pain and suffering) I was recently looking into how to compile extensions out-of-tree on Windows using Visual Studio (i.e. no PGXS). It looks like the conventional answer to this is "Do a source build of PG, compile your ext in-tree in contrib/, and hope the result is binary compatible with release PostgreSQL builds for Windows." Certainly that's how I've been doing it to date. How about everyone else here? Does anyone actually build and distribute extensions out of tree at all? I'm interested in making the Windows installer distributions a bit more extension-dev friendly. In particular, I'd really like to see EDB's Windows installers include the libintl.h for the included libintl, since its omission, combined with Pg being built with ENABLE_NLS, tends to break things horribly. Users can always undefine ENABLE_NLS, but it's an unnecessary roadblock. Are there any objections from -hackers to including 3rd party headers for libs we expose in our public headers in the binary distribution? Other than bundling 3rd party headers, any ideas/suggestions for how we might make ext building saner on Windows? -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [PATCH] Store Extension Options
Robert Haas robertmh...@gmail.com writes: On Sun, Jan 5, 2014 at 3:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: pg_dump creates extensions before tables, no? So what dump/restore hazard? Creating the extension doesn't guarantee that the shared library will always be loaded. No, but unless the plan is that no validation happens at all (which I gather is not your desire) then there must be some mechanism for figuring out which extension owns a given reloption and asking it to validate the value. This might be more complicated than a passive hook, but I don't feel bad about demanding that it work like that. regards, tom lane
Re: [HACKERS] [PATCH] Store Extension Options
Fabrízio de Royes Mello fabriziome...@gmail.com writes: You are correct. pg_dump exports reloptions using the WITH clause of the CREATE TABLE statement, i.e.: CREATE TABLE foo ( ) WITH (autovacuum_enabled=false, bdr.do_replicate=false); So if this statement checks that the 'bdr' extension is loaded, then a partial restore can fail. I see absolutely *nothing* wrong with failing that command if bdr is not installed. For an analogy, if this table includes a column of type bar defined by some extension baz, we are certainly going to fail the CREATE TABLE if baz isn't installed. Now, if bdr is installed but the validation doesn't happen unless bdr is loaded in some sense, then that is an implementation deficiency that I think we can insist be rectified before this feature is accepted. regards, tom lane
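The ownership-plus-validation model Tom is describing might be sketched as follows. This is a hypothetical registry in Python; none of these names exist in PostgreSQL, and the real mechanism would live in C in the reloptions code:

```python
# Hypothetical sketch: each loaded extension registers a validator for its
# reloption prefix, and setting a prefixed option fails outright unless the
# owning extension is present to check the value.

validators = {}  # prefix -> callable(name, value), raises on bad input

def register_reloption_validator(prefix, fn):
    validators[prefix] = fn

def set_reloption(name, value):
    if "." not in name:
        return ("builtin", name, value)   # core options validated elsewhere
    prefix, _ = name.split(".", 1)
    if prefix not in validators:
        raise ValueError(f'extension "{prefix}" is not loaded')
    validators[prefix](name, value)       # extension-supplied check
    return ("extension", name, value)

# Without bdr loaded, the SET is rejected, which matches the restore
# behavior Tom considers acceptable:
try:
    set_reloption("bdr.do_replicate", "false")
except ValueError as e:
    print(e)  # extension "bdr" is not loaded

# Once bdr registers its validator, the same option is checked and accepted:
def validate_bdr(name, value):
    if value not in ("true", "false"):
        raise ValueError(f"invalid value for {name}: {value}")

register_reloption_validator("bdr", validate_bdr)
print(set_reloption("bdr.do_replicate", "false"))
```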
Re: [HACKERS] Compiling extensions on Windows
On 01/05/2014 10:32 PM, Craig Ringer wrote: Are there any objections from -hackers to including 3rd party headers for libs we expose in our public headers in the binary distribution? Other than bundling 3rd party headers, any ideas/suggestions for how we might make ext building saner on Windows? If you're bundling a DLL then I don't see why the corresponding header file shouldn't be included also. cheers andrew
Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
On Sun, Jan 5, 2014 at 11:04 PM, james ja...@mansionfamily.plus.com wrote: On 05/01/2014 16:50, Robert Haas wrote: But on Windows, segments are *automatically* destroyed *by the operating system* when the last process unmaps them, so it's not quite so clear to me how we can allow it there. The main shared memory segment is no problem because the postmaster always has it mapped, even if no one else does, but that doesn't help for dynamic shared memory segments. Surely you just need to DuplicateHandle into the parent process? Ideally DuplicateHandle should work, but while going through the Windows internals of the shared memory functions at the link below, I observed that it is mentioned there that this will work for a child process. http://msdn.microsoft.com/en-us/library/ms810613.aspx Refer to the section "Inheriting and duplicating memory-mapped file object handles". If you want to (tidily) dispose of it at some time, then you'll need to tell the postmaster that you have done so and what the handle is in its process, but if you just want it to stick around, then you can just pass it up. DuplicateHandle should work, but we need to communicate the handle to the other process using IPC. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] RFC: Async query processing
On Fri, Jan 03, 2014 at 03:06:11PM -0200, Claudio Freire wrote: On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: Claudio Freire klaussfre...@gmail.com writes: On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer fwei...@redhat.com wrote: Loading data into the database isn't such an uncommon task. Not everything is OLTP. Truly, but a sustained insert stream of 10 Mbps is certainly way beyond common non-OLTP loads. This is far more specific than non-OLTP. I think Florian has a good point there, and the reason is this: what you are talking about will be of exactly zero use to applications that want to see the results of one query before launching the next. Which eliminates a whole lot of apps. I suspect that almost the *only* common use case in which a stream of queries can be launched without feedback is going to be bulk data loading. It's not clear at all that pipelining the PQexec code path is the way to better performance for that --- why not use COPY, instead? You're forgetting ORM workloads. ORMs can usually plan the inserts to be in a sequence that both doesn't require feedback (except the knowledge that they were successful) and doesn't violate constraints. Flushing a whole object hierarchy, for instance, can be done without feedback. Not even serial columns need feedback, since many ORMs (SQLAlchemy, Hibernate) support allocation of ID sequences in batches (by issuing an appropriate SELECT nextval()). This is already doable using data-modifying WITH clauses. Whether ORMs are (or should be made) smart enough to take advantage of this is probably out of the scope of this discussion, though. For what it's worth, I'm in favor of orthogonality in languages, so +1 from me. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
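Claudio's feedback-free ORM flush can be sketched as follows. This is illustrative Python, not any particular ORM's API: allocate_id_block stands in for the single SELECT nextval() round trip that batch sequence allocation costs, after which every INSERT in the flush can be sent without waiting for per-row results:

```python
# Sketch of the ORM pattern: pre-allocate a block of sequence IDs up front,
# assign them client-side, then emit the whole flush as one batch of INSERTs
# that needs no intermediate feedback (only a final success/failure).

def allocate_id_block(next_id, n):
    """Stand-in for one round trip, e.g. a batched SELECT over nextval()."""
    return list(range(next_id, next_id + n)), next_id + n

def flush(objects, next_id=1):
    ids, next_id = allocate_id_block(next_id, len(objects))  # the only sync point
    stmts = [
        f"INSERT INTO t (id, name) VALUES ({i}, '{name}');"
        for i, name in zip(ids, objects)
    ]
    return stmts, next_id  # whole batch can be pipelined without intermediate syncs

stmts, _ = flush(["a", "b", "c"])
for s in stmts:
    print(s)
# INSERT INTO t (id, name) VALUES (1, 'a');
# INSERT INTO t (id, name) VALUES (2, 'b');
# INSERT INTO t (id, name) VALUES (3, 'c');
```

Because the IDs are known before the flush, child rows referencing these keys can be queued in the same batch, which is what makes flushing a whole object hierarchy feedback-free.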