Re: Postgres with pthread
On 27.12.2017 13:08, Andres Freund wrote:
> On December 27, 2017 11:05:52 AM GMT+01:00, james wrote:
>> All threads are blocked in semaphores. That they are blocked is inevitable - I guess the issue is that they are thrashing. I guess it would be necessary to separate the internals to have some internal queueing and effectively reduce the number of actively executing threads. In effect make the connection pooling work internally. Would it be possible to make the caches have persistent (functional) data structures - effectively CoW? And how easy would it be to abort if the master view had subsequently changed when it comes to execution?
>
> Optimizing for this seems like a pointless exercise. If the goal is efficient processing of 100k connections the solution is a session / connection abstraction and a scheduler. Optimizing for this amount of concurrency will just add complexity and slowdowns for a workload that nobody will run.

I agree with you that supporting 100k active connections does not make much practical sense right now. But there are many systems with hundreds of cores, and to utilize them we still need to spawn thousands of backends. In this case Postgres snapshots and local caches become inefficient. Switching to CSN would more or less solve the snapshot problem, but the problem of private caches should also be addressed: it seems very wasteful to perform the same work 1000 times over and to maintain 1000 copies. Also, in the case of global prepared statements, the presence of a global cache would allow us to spend more time on plan optimization and manual tuning.

Switching to the pthreads model significantly simplifies the development of shared caches: there are no problems with a statically allocated shared address space, or with dynamic segments mapped at different addresses that prevent the use of normal pointers. Invalidation of a shared cache is also easier: there is no need to send invalidation notifications to all backends. But it still requires a lot of work.
For example, the catalog cache is tightly integrated with the resource owner information. A shared cache also requires synchronization, and this synchronization can itself become a bottleneck.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
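The invalidation point above can be illustrated with a minimal sketch (all names here are illustrative, not actual PostgreSQL code): once all backends share one address space, invalidation can be a shared generation counter that every reader checks, instead of invalidation messages delivered to each backend.

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t cache_generation = 1;   /* bumped on every DDL-like change */

typedef struct
{
    uint64_t generation;  /* generation at which this entry was built */
    int      value;       /* cached payload */
} CacheEntry;

/* A writer invalidates every reader at once by bumping the counter. */
static void
invalidate_cache(void)
{
    pthread_mutex_lock(&cache_lock);
    cache_generation++;
    pthread_mutex_unlock(&cache_lock);
}

/* Readers revalidate by comparing generations; a stale entry is rebuilt. */
static int
lookup(CacheEntry *e, int (*rebuild) (void))
{
    pthread_mutex_lock(&cache_lock);
    if (e->generation != cache_generation)
    {
        e->value = rebuild();
        e->generation = cache_generation;
    }
    int v = e->value;
    pthread_mutex_unlock(&cache_lock);
    return v;
}
```

As the message notes, the single mutex here is exactly the kind of synchronization that could itself become a bottleneck; a real implementation would need partitioned locks or lock-free reads.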
Re: Postgres with pthread
On December 27, 2017 11:05:52 AM GMT+01:00, james wrote:
>> All threads are blocked in semaphores.
>
> That they are blocked is inevitable - I guess the issue is that they are thrashing.
>
> I guess it would be necessary to separate the internals to have some internal queueing and effectively reduce the number of actively executing threads. In effect make the connection pooling work internally.
>
> Would it be possible to make the caches have persistent (functional) data structures - effectively CoW?
>
> And how easy would it be to abort if the master view had subsequently changed when it comes to execution?

Optimizing for this seems like a pointless exercise. If the goal is efficient processing of 100k connections, the solution is a session / connection abstraction and a scheduler. Optimizing for this amount of concurrency will just add complexity and slowdowns for a workload that nobody will run.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: Postgres with pthread
> All threads are blocked in semaphores.

That they are blocked is inevitable - I guess the issue is that they are thrashing.

I guess it would be necessary to separate the internals to have some internal queueing and effectively reduce the number of actively executing threads. In effect, make the connection pooling work internally.

Would it be possible to make the caches have persistent (functional) data structures - effectively CoW?

And how easy would it be to abort if the master view had subsequently changed when it comes to execution?
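For what a CoW cache could look like, here is a minimal refcounting sketch (hypothetical names, not derived from the Postgres code): readers pin an immutable version of the cache, writers publish a modified copy, and an old version is freed when its last reader drops it.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct CacheVersion
{
    int  refcount;
    int  nitems;
    int *items;            /* immutable once published */
} CacheVersion;

static pthread_mutex_t cv_lock = PTHREAD_MUTEX_INITIALIZER;
static CacheVersion *current_version;

static void
cache_init(int nitems)
{
    current_version = calloc(1, sizeof(CacheVersion));
    current_version->refcount = 1;          /* the publish reference */
    current_version->nitems = nitems;
    current_version->items = calloc(nitems, sizeof(int));
}

/* Reader: pin the current immutable version; no copying needed. */
static CacheVersion *
cache_acquire(void)
{
    pthread_mutex_lock(&cv_lock);
    CacheVersion *v = current_version;
    v->refcount++;
    pthread_mutex_unlock(&cv_lock);
    return v;
}

static void
cache_release(CacheVersion *v)
{
    pthread_mutex_lock(&cv_lock);
    if (--v->refcount == 0 && v != current_version)
    {
        free(v->items);
        free(v);
    }
    pthread_mutex_unlock(&cv_lock);
}

/* Writer: copy-on-write - build a modified copy, then publish it.
 * Readers holding the old version are unaffected. */
static void
cache_update(int idx, int value)
{
    pthread_mutex_lock(&cv_lock);
    CacheVersion *old = current_version;
    CacheVersion *new = malloc(sizeof(CacheVersion));
    new->refcount = 1;
    new->nitems = old->nitems;
    new->items = malloc(new->nitems * sizeof(int));
    memcpy(new->items, old->items, new->nitems * sizeof(int));
    new->items[idx] = value;
    current_version = new;
    if (--old->refcount == 0)               /* drop old publish reference */
    {
        free(old->items);
        free(old);
    }
    pthread_mutex_unlock(&cv_lock);
}
```

This also shows the "abort if the master view changed" idea: a pinned version is a stable snapshot, and execution can compare its pointer (or a version number) against `current_version` to detect that it has gone stale.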
Re: Postgres with pthread
On 21.12.2017 16:25, Konstantin Knizhnik wrote:
> I continue experiments with my pthread prototype. The latest results are the following:
>
> 1. I have eliminated all (I hope) calls of non-reentrant functions (getopt, setlocale, setitimer, localtime, ...), so the parallel tests now pass.
>
> 2. I have implemented deallocation of the top memory context (at thread exit) and cleanup of all opened file descriptors. I had to replace several places where malloc is used with top_malloc: allocation in the top context.
>
> 3. My prototype now passes all regression tests, but error handling is still far from complete.
>
> 4. I have performed experiments with replacing the synchronization primitives used in Postgres with pthread analogues. Unfortunately it has almost no influence on performance.
>
> 5. Handling a large number of connections. The maximal number of postgres connections is almost the same: 100k. But the memory footprint in case of pthreads was significantly smaller: 18Gb vs 38Gb. And the difference in performance was much higher: 60k TPS vs 600k TPS. Compare it with the performance for 10k clients: 1300k TPS. It is a read-only pgbench -S test. Since pgbench doesn't allow more than 1000 clients per instance, I spawned several instances of pgbench.
>
> Why is handling a large number of connections important? It allows applications to access postgres directly, without pgbouncer or any other external connection pooling tool. In this case an application can use prepared statements, which can almost double the speed of simple queries.
>
> Unfortunately Postgres sessions are not lightweight. Each backend maintains its private catalog and relation caches, prepared statement cache, ... For a real database the size of these caches in memory will be several megabytes, and warming them can take a significant amount of time. So if we really want to support a large number of connections, we should rewrite the caches to be global (shared). It will allow us to save a lot of memory but will add synchronization overhead. Also, on NUMA private caches may be more efficient than one global cache.
>
> My prototype can be found at: git://github.com/postgrespro/postgresql.pthreads.git

Finally I managed to run Postgres with 100k active connections. I am not sure this result can claim a place in the Guinness records, but I am almost sure that nobody has done it before (at least with the original version of Postgres). But it was really a "Pyrrhic victory": performance for 100k connections is 1000 times slower than for 10k. All threads are blocked in semaphores. This is a more or less expected result, but the scale of the degradation is still impressive:

#Connections   TPS
100k           550
10k            558k
6k             745k
4k             882k
2k             1100k
1k             1300k

As is clear from these stack traces, a shared catalog cache and statement cache are badly needed to provide good performance with such a large number of active backends:

(gdb) thread apply all bt

Thread 17807 (LWP 660863):
#0  0x7f4c1cb46576 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x7f4c1cb46668 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00697a32 in PGSemaphoreLock ()
#3  0x00702a64 in LWLockAcquire ()
#4  0x006fbf2d in LockAcquireExtended ()
#5  0x006f9fa3 in LockRelationOid ()
#6  0x004b2ffd in relation_open ()
#7  0x004b31d6 in heap_open ()
#8  0x007f1ed1 in CatalogCacheInitializeCache ()
#9  0x007f3835 in SearchCatCache1 ()
#10 0x00800510 in get_tablespace ()
#11 0x008006e1 in get_tablespace_page_costs ()
#12 0x0065a4e1 in cost_seqscan ()
#13 0x0068bf92 in create_seqscan_path ()
#14 0x006568b4 in set_rel_pathlist ()
#15 0x00656eb8 in make_one_rel ()
#16 0x006740d0 in query_planner ()
#17 0x00676526 in grouping_planner ()
#18 0x00679812 in subquery_planner ()
#19 0x0067a66c in standard_planner ()
#20 0x0070ffe1 in pg_plan_query ()
#21 0x007100b6 in pg_plan_queries ()
#22 0x007f6c6f in BuildCachedPlan ()
#23 0x007f6e5c in GetCachedPlan ()
#24 0x00711ccf in PostgresMain ()
#25 0x006a5535 in backend_main_proc ()
#26 0x006a353d in thread_trampoline ()
#27 0x7f4c1cb3d36d in start_thread () from /lib64/libpthread.so.0
#28 0x7f4c1c153b8f in clone () from /lib64/libc.so.6

Thread 17806 (LWP 660861):
#0  0x7f4c1cb46576 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x7f4c1cb46668 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00697a32 in PGSemaphoreLock ()
#3  0x00702a64 in LWLockAcquire ()
#4  0x006fbf2d in LockAcquireExtended ()
#5  0x006f9fa3 in LockRelationOid ()
#6  0x004b2ffd in relation_open ()
#7  0x004b31d6 in heap_open ()
#8  0x007f1ed1 in CatalogCacheInitializeCache ()
#9
Re: Postgres with pthread
2017-12-21 14:25 GMT+01:00 Konstantin Knizhnik:
> I continue experiments with my pthread prototype. The latest results are the following:
>
> 1. I have eliminated all (I hope) calls of non-reentrant functions (getopt, setlocale, setitimer, localtime, ...), so the parallel tests now pass.
>
> 2. I have implemented deallocation of the top memory context (at thread exit) and cleanup of all opened file descriptors. I had to replace several places where malloc is used with top_malloc: allocation in the top context.
>
> 3. My prototype now passes all regression tests, but error handling is still far from complete.
>
> 4. I have performed experiments with replacing the synchronization primitives used in Postgres with pthread analogues. Unfortunately it has almost no influence on performance.
>
> 5. Handling a large number of connections. The maximal number of postgres connections is almost the same: 100k. But the memory footprint in case of pthreads was significantly smaller: 18Gb vs 38Gb. And the difference in performance was much higher: 60k TPS vs 600k TPS. Compare it with the performance for 10k clients: 1300k TPS. It is a read-only pgbench -S test. Since pgbench doesn't allow more than 1000 clients per instance, I spawned several instances of pgbench.
>
> Why is handling a large number of connections important? It allows applications to access postgres directly, without pgbouncer or any other external connection pooling tool. In this case an application can use prepared statements, which can almost double the speed of simple queries.

From what I know, MySQL did not have a good experience with a high number of threads - that is why there is a thread pool in the enterprise version (and now in MariaDB too).

Regards

Pavel

> Unfortunately Postgres sessions are not lightweight. Each backend maintains its private catalog and relation caches, prepared statement cache, ... For a real database the size of these caches in memory will be several megabytes, and warming them can take a significant amount of time. So if we really want to support a large number of connections, we should rewrite the caches to be global (shared). It will allow us to save a lot of memory but will add synchronization overhead. Also, on NUMA private caches may be more efficient than one global cache.
>
> My prototype can be found at: git://github.com/postgrespro/postgresql.pthreads.git
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
Re: Postgres with pthread
I continue experiments with my pthread prototype. The latest results are the following:

1. I have eliminated all (I hope) calls of non-reentrant functions (getopt, setlocale, setitimer, localtime, ...), so the parallel tests now pass.

2. I have implemented deallocation of the top memory context (at thread exit) and cleanup of all opened file descriptors. I had to replace several places where malloc is used with top_malloc: allocation in the top context.

3. My prototype now passes all regression tests, but error handling is still far from complete.

4. I have performed experiments with replacing the synchronization primitives used in Postgres with pthread analogues. Unfortunately it has almost no influence on performance.

5. Handling a large number of connections. The maximal number of postgres connections is almost the same: 100k. But the memory footprint in case of pthreads was significantly smaller: 18Gb vs 38Gb. And the difference in performance was much higher: 60k TPS vs 600k TPS. Compare it with the performance for 10k clients: 1300k TPS. It is a read-only pgbench -S test. Since pgbench doesn't allow more than 1000 clients per instance, I spawned several instances of pgbench.

Why is handling a large number of connections important? It allows applications to access postgres directly, without pgbouncer or any other external connection pooling tool. In this case an application can use prepared statements, which can almost double the speed of simple queries.

Unfortunately Postgres sessions are not lightweight. Each backend maintains its private catalog and relation caches, prepared statement cache, ... For a real database the size of these caches in memory will be several megabytes, and warming them can take a significant amount of time. So if we really want to support a large number of connections, we should rewrite the caches to be global (shared). It will allow us to save a lot of memory but will add synchronization overhead. Also, on NUMA private caches may be more efficient than one global cache.

My prototype can be found at: git://github.com/postgrespro/postgresql.pthreads.git

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Postgres with pthread
On 06/12/2017 17:26, Andreas Karlsson wrote:
> An additional issue is that this could break a lot of extensions, and in a way that is not apparent at compile time. This means we may need to break all extensions to force extension authors to check if they are thread safe. I do not like making life hard for our extension community, but if the gains are big enough it might be worth it.

It seems to me that the counter-argument is that extensions that naturally support threading will benefit. For example, it may be a lot more practical to have CLR or JVM extensions.
Re: Postgres with pthread
On Sat, Dec 9, 2017 at 1:09 AM, konstantin knizhnik <k.knizh...@postgrespro.ru> wrote:
> I am not going to show stack traces of all 1000 threads. But you may notice that the proc array lock really seems to be a bottleneck.

Yes, the proc array lock easily becomes a bottleneck on a multicore machine with a large number of connections. Related to this, another patch that helps with a large number of connections is CSN. When our snapshot model was invented, xip was just an array of a few elements, and that caused no problems. Now, we're considering threads to help us handle thousands of connections. A snapshot with thousands of xips looks ridiculous: collecting such a large snapshot can be more expensive than a single index lookup.

These two patches, threads and CSN, are both complicated and will require hard work during multiple release cycles to get committed. But I really hope that their cumulative effect can dramatically improve the situation with a high number of connections. There are already some promising benchmarks in the CSN thread. I wonder if we can already do some cumulative benchmarks of threads + CSN?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
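A rough sketch of why CSN helps here (illustrative structures only, not the actual patch): with the classic snapshot every visibility check may scan the xip array of in-progress transactions, which grows with the connection count, while a CSN-style check is a single comparison against the snapshot's commit sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Classic snapshot: visibility requires scanning the xip array of
 * in-progress transactions -- O(number of connections) per check. */
typedef struct
{
    TransactionId  xmin;    /* all xids below this are finished */
    TransactionId  xmax;    /* all xids at or above this are future */
    TransactionId *xip;     /* in-progress xids at snapshot time */
    int            nxip;
} XipSnapshot;

static bool
xid_in_progress(const XipSnapshot *s, TransactionId xid)
{
    if (xid < s->xmin)
        return false;
    if (xid >= s->xmax)
        return true;
    for (int i = 0; i < s->nxip; i++)    /* the expensive part */
        if (s->xip[i] == xid)
            return true;
    return false;
}

/* CSN-style snapshot: a tuple's xact is visible iff it committed before
 * the snapshot was taken -- one comparison, independent of connection
 * count.  commit_csn stands in for a lookup in a shared XID->CSN map;
 * 0 means "not committed". */
typedef struct
{
    uint64_t snapshot_csn;
} CsnSnapshot;

static bool
csn_visible(const CsnSnapshot *s, uint64_t commit_csn)
{
    return commit_csn != 0 && commit_csn <= s->snapshot_csn;
}
```

Taking the snapshot changes in the same way: instead of copying thousands of xips out of the proc array under ProcArrayLock, a CSN snapshot is essentially reading one counter.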
Re: Postgres with pthread
On Dec 7, 2017, at 10:41 AM, Simon Riggs wrote:
>> But it is a theory. The main idea of this prototype was to prove or disprove this expectation in practice.
>> But please notice that it is a very raw prototype. A lot of stuff is not working yet. And supporting all of the existing Postgres functionality requires much more effort (and even more effort is needed for optimizing Postgres for this architecture).
>> I just want to receive some feedback and know if the community is interested in any further work in this direction.
>
> Looks good. You are right, it is a theory. If your prototype does actually show what we think it does then it is a good and interesting result.
>
> I think we need careful analysis to show where these exact gains come from. The actual benefit is likely not evenly distributed across the list of possible benefits. Did they arise because you produced a stripped down version of Postgres? Or did they arise from using threads?
>
> It would not be the first time a result shown in prototype did not show real gains on a completed project.
>
> I might also read your results to show that connection concentrators would be a better area of work, since 100 connections perform better than 1000 in both cases, so why bother optimising for 1000 connections at all? In which case we should read the benefit at the 100 connections line, where it shows the lower 28% gain, closer to the gain your colleague reported.
>
> So I think we don't yet have enough to make a decision.

Concerning the optimal number of connections: one of my intentions was to eliminate the need for an external connection pool (pgbouncer). In this case applications can use prepared statements, which by themselves provide a twofold increase in performance. I believe that threads have a smaller footprint than processes, so it is possible to spawn more threads and access them directly, without an intermediate connection pooling layer.

I have performed experiments on a more powerful server: 144 virtual cores, Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz. Here the results for read-only queries are different: both the pthreads and the vanilla version show almost the same speed for both 100 and 1000 connections: about 1300k TPS with prepared statements. So there is no performance degradation with an increased number of connections, and no large difference between processes and threads.

But with a read-write workload (pgbench -N) there is still a significant advantage for the pthreads version (kTPS):

Connections   Vanilla   pthreads
100           165       154
1000          85        118

For some reason (which I do not know yet) the multiprocess version of postgres is slightly faster for 100 connections, but degrades almost twice for 1000 connections, while the degradation of the multithreaded version is not so large.

By the way, the pthreads version makes it much easier to check what's going on using gdb (manual "profiling"):

thread apply all bt

Thread 997 (Thread 0x7f6e08810700 (LWP 61345)):
#0  0x7f7e03263576 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x7f7e03263668 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00698552 in PGSemaphoreLock ()
#3  0x00702804 in LWLockAcquire ()
#4  0x004f9ac4 in XLogInsertRecord ()
#5  0x00503b97 in XLogInsert ()
#6  0x004bb0d1 in log_heap_clean ()
#7  0x004bd7c8 in heap_page_prune ()
#8  0x004bd9c1 in heap_page_prune_opt ()
#9  0x004c43d4 in index_fetch_heap ()
#10 0x004c4410 in index_getnext ()
#11 0x006037d2 in IndexNext ()
#12 0x005f3a80 in ExecScan ()
#13 0x00609eba in ExecModifyTable ()
#14 0x005ed6fa in standard_ExecutorRun ()
#15 0x00713622 in ProcessQuery ()
#16 0x00713885 in PortalRunMulti ()
#17 0x007143a5 in PortalRun ()
#18 0x00711cf1 in PostgresMain ()
#19 0x006a708b in backend_main_proc ()
#20 0x7f7e0325a36d in start_thread () from /lib64/libpthread.so.0
#21 0x7f7e02870b8f in clone () from /lib64/libc.so.6

Thread 996 (Thread 0x7f6e08891700 (LWP 61344)):
#0  0x7f7e03263576 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x7f7e03263668 in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00698552 in PGSemaphoreLock ()
#3  0x00702804 in LWLockAcquire ()
#4  0x004bc862 in RelationGetBufferForTuple ()
#5  0x004b60db in heap_insert ()
#6  0x0060ad3b in ExecModifyTable ()
#7  0x005ed6fa in standard_ExecutorRun ()
#8  0x00713622 in ProcessQuery ()
#9  0x00713885 in PortalRunMulti ()
#10 0x007143a5 in PortalRun ()
#11 0x00711cf1 in PostgresMain ()
#12 0x006a708b in backend_main_proc ()
#13 0x7f7e0325a36d in start_thread () from /lib64/libpthread.so.0
#14 0x7f7e02870b8f in clone () from /lib64/libc.so.6

Thread 995 (Thread 0x7f6e08912700 (LWP 61343)):
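The gdb "manual profiling" shown above aggregates nicely with a few shell commands into a poor man's contention profile. A sketch, using a small stand-in for a saved `thread apply all bt` dump (the file name and sample frames are illustrative):

```shell
# Build a tiny sample dump standing in for real "thread apply all bt" output.
cat > bt.txt <<'EOF'
#2  0x00698552 in PGSemaphoreLock ()
#3  0x00702804 in LWLockAcquire ()
#4  0x004f9ac4 in XLogInsertRecord ()
#2  0x00698552 in PGSemaphoreLock ()
#3  0x00702804 in LWLockAcquire ()
#4  0x004bc862 in RelationGetBufferForTuple ()
EOF

# Keep the frame just above LWLockAcquire (the caller that is blocked on the
# lock) and count how often each one appears across all threads.
grep -A1 'LWLockAcquire' bt.txt | grep '^#' | grep -v 'LWLockAcquire' \
  | awk '{print $4}' | sort | uniq -c | sort -rn
```

Run against a full dump of thousands of threads, the top of this list points straight at the hottest lock callers (here XLogInsertRecord and RelationGetBufferForTuple, matching the traces above).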
Re: Postgres with pthread
On 8 December 2017 at 03:58, Andres Freund wrote:
> On 2017-12-07 11:26:07 +0800, Craig Ringer wrote:
>> PostgreSQL's architecture conflates "connection", "session" and "executor" into one somewhat muddled mess.
>
> How is the executor entangled in the other two?

Executor in the postgres sense isn't, so I chose the word poorly. "Engine of execution", maybe. What I'm getting at is that we tie up more resources than should ideally be necessary when a session is idle, especially idle in transaction. But I guess a lot of that is really down to memory allocated and not returned to the OS (because, like other C programs, we can't do that), etc. The key resources like PGXACT entries aren't something we can release while idle in a transaction, after all.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Postgres with pthread
Hi,

On 2017-12-07 20:48:06 +0000, Greg Stark wrote:
> But then I thought about it a bit and I do wonder. I don't know how well we test having multiple portals doing all kinds of different query plans with their execution interleaved.

Cursors test that pretty well.

> And I definitely have doubts whether you can start SPI sessions from arbitrary points in the executor expression evaluation and don't know what state you can leave and resume them from on subsequent evaluations...

SPI being weird doesn't really have that much bearing on the executor structure imo. But I'm unclear what you'd use SPI for that really necessitates that. We don't suspend execution in the middle of function execution...

Greetings,

Andres Freund
Re: Postgres with pthread
On 2017-12-07 11:26:07 +0800, Craig Ringer wrote:
> PostgreSQL's architecture conflates "connection", "session" and "executor" into one somewhat muddled mess.

How is the executor entangled in the other two?

Greetings,

Andres Freund
Re: Postgres with pthread
On Wed, Dec 6, 2017 at 10:20 PM, Craig Ringer wrote:
> Personally I think it's a pity we didn't land up here before the foundations for parallel query went in - DSM, shm_mq, DSA, etc. I know the EDB folks at least looked into it though, and presumably there were good reasons to go in this direction. Maybe that was just "community will never accept threaded conversion" at the time, though.

Yep. Never is a long time, but it took 3 release cycles to get a user-visible feature as it was, and if I'd tried to insist on a process->thread conversion first I suspect we'd still be stuck on that point today. Perhaps we would have gotten as far as getting that much done, but that wouldn't make parallel query be done on top of it.

> Now we have quite a lot of homebrew infrastructure to consider if we do a conversion.
>
> That said, it might in some ways make it easier. shm_mq, for example, would likely convert to a threaded backend with minimal changes to callers, and probably only limited changes to shm_mq itself. So maybe these abstractions will prove to have been a win in some ways. Except DSA, and even then it could serve as a transitional API...

Yeah, I don't feel too bad about what we've built. Even if it ultimately goes away, it will have served the useful purpose of proving that parallel query is a good idea and can work. Besides, shm_mq is just a ring buffer for messages; that's not automatically something that we don't want just because we move to threads. If it goes away, which I think not unlikely, it'll be because something else is faster.

Also, it's not as if only parallel query structures might have been designed differently if we had been using threads all along. dynahash, for example, is quite unlike most concurrent hash tables, and a big part of the reason is that it has to cope with being situated in a fixed-size chunk of shared memory. More generally, the whole reason there's no cheap, straightforward palloc_shared() is the result of the current design, and it seems very unlikely we wouldn't have that quite apart from parallel query. Install pg_stat_statements without a server restart? Yes, please.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
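For readers unfamiliar with shm_mq, the core idea - a message ring living in a fixed-size chunk of memory - can be sketched as below. This is not the real shm_mq API (which adds message framing, memory barriers, and blocking); it only shows why such a queue is indifferent to whether the two sides are processes over shared memory or threads in one address space: the buffer is just memory both sides can see.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal single-producer/single-consumer byte ring.  head and tail are
 * monotonically increasing counters; their difference is the fill level,
 * and the power-of-two size makes the modulo wrap around cleanly. */
#define RING_SIZE 64

typedef struct
{
    size_t        head;             /* next byte to write */
    size_t        tail;             /* next byte to read */
    unsigned char data[RING_SIZE];
} Ring;

static size_t
ring_used(const Ring *r)
{
    return r->head - r->tail;
}

static bool
ring_put(Ring *r, const void *src, size_t len)
{
    if (RING_SIZE - ring_used(r) < len)
        return false;                    /* would overwrite unread data */
    const unsigned char *p = src;
    for (size_t i = 0; i < len; i++)
        r->data[(r->head + i) % RING_SIZE] = p[i];
    r->head += len;
    return true;
}

static bool
ring_get(Ring *r, void *dst, size_t len)
{
    if (ring_used(r) < len)
        return false;                    /* not enough data yet */
    unsigned char *p = dst;
    for (size_t i = 0; i < len; i++)
        p[i] = r->data[(r->tail + i) % RING_SIZE];
    r->tail += len;
    return true;
}
```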
Re: Postgres with pthread
On 7 December 2017 at 19:55, Konstantin Knizhnik wrote:
> Pros:
> 1. Simplified memory model: no need for DSM, shm_mq, DSA, etc.

shm_mq would remain useful, and the others could only be dropped if you also dropped process-model support entirely.

> 1. Breaks compatibility with existing extensions and adds more requirements for authors of new extensions.

Depends on how much frightening preprocessor magic you're willing to use, doesn't it? ;) I wouldn't be surprised if simple extensions (C functions etc) stayed fairly happy, but it'd be hazardous enough in terms of library use etc that deliberate breakage may be better.

> 2. Problems with integration of single-threaded PLs: Python, Lua, ...

Yeah, that's going to hurt. Especially since most non-plpgsql code out there will be plperl and plpython. Breaking that's not going to be an option, but nobody's going to be happy if all postgres backends must contend for the same Python GIL. Plus it'd be deadlock-city. That's nearly a showstopper right there. Especially since, with a quick look around, it looks like the cPython GIL is per-DLL (at least on Windows), not per-interpreter-state, so spawning separate interpreter states per thread may not be sufficient. That makes sense given that cPython itself is thread-aware; otherwise it'd have a really hard time figuring out which GIL and interpreter state to look at when in a cPython-spawned thread.

> 3. Worse protection from programming errors, including errors in extensions.

Mainly contaminating memory of unrelated processes, or the postmaster. I'm not worried about outright crashes. On any modern system it's not significantly worse to take down the postmaster than it is to have it do its own recovery. A modern init will restart it promptly. (If you're not running postgres under an init daemon for production then... well, you should be.)

> 4. Lack of explicit separation of shared and private memory leads to more synchronization errors.

Accidentally clobbering postmaster memory/state would be my main worry there. Right now we gain a lot of protection from our copy-on-write shared-nothing-by-default model, and we rely on it in quite a lot of places where backends merrily stomp on inherited postmaster state.

The more I think about it, the less enthusiastic I am, really.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Postgres with pthread
On 07.12.2017 00:58, Thomas Munro wrote:
> Using a ton of thread local variables may be a useful stepping stone, but if we want to be able to separate threads/processes from sessions eventually then I guess we'll want to model sessions as first class objects and pass them around explicitly or using a single TLS variable current_session.

That was my primary intention. Unfortunately, separating all static variables into some kind of session context requires much more effort: we would have to change all accesses to such variables. But please notice that, from a performance point of view, access to __thread variables is no more expensive than access to a static variable or to fields of a session context structure through current_session. And there is no extra space overhead for them.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
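The two approaches being compared can be sketched as follows (illustrative names, not actual PostgreSQL code): a plain __thread variable, versus an explicit session object reached through a single __thread pointer. The latter costs one indirection but makes the session a first-class object that could eventually be detached from its thread.

```c
#include <pthread.h>

/* (a) Thread-local static: each thread transparently sees its own copy,
 * and existing code referencing the variable needs no changes. */
static __thread int command_counter;

/* (b) Explicit session object reached through a single TLS pointer:
 * all per-session state lives in one struct, so a scheduler could later
 * hand the session to a different thread. */
typedef struct
{
    int command_counter;
} Session;

static __thread Session *current_session;

static int
next_command_id(void)
{
    return ++current_session->command_counter;
}
```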
Re: Postgres with pthread
Hi,

On 06.12.2017 20:08, Andres Freund wrote:
>> 4. Rewrite file descriptor cache to be global (shared by all threads).
> That one I'm very unconvinced of, that's going to add a ton of new contention.

Do you mean lock contention because of the mutex I used to synchronize access to the shared file descriptor cache, or contention for the file descriptors themselves? Right now each thread has its own virtual file descriptors, so they are not shared between threads; but there is a common LRU restricting the total number of opened descriptors in the process.

Actually I have no other choice if I want to support thousands of connections. If each thread has its own private descriptor cache (as it is now for processes), and its size is estimated based on the open file quota, then there will be millions of opened file descriptors.

Concerning contention for the mutex, I do not think it is a problem. At least I can say that performance (with 100 connections) improved significantly, and shows almost the same speed as for 10 connections, after I rewrote the file descriptor cache and made it global (my original implementation just made all fd.c static variables thread-local, so each thread had its separate pool).

It is possible to go further and share file descriptors between threads, using pwrite/pread instead of seek+read/write. But we would still need a mutex to maintain the LRU list and the free handler list.
Re: Postgres with pthread
>> But it is a theory. The main idea of this prototype was to prove or disprove this expectation in practice.
>> But please notice that it is a very raw prototype. A lot of stuff is not working yet. And supporting all of the existing Postgres functionality requires much more effort (and even more effort is needed for optimizing Postgres for this architecture).
>>
>> I just want to receive some feedback and know if the community is interested in any further work in this direction.

Looks good. You are right, it is a theory. If your prototype does actually show what we think it does then it is a good and interesting result.

I think we need careful analysis to show where these exact gains come from. The actual benefit is likely not evenly distributed across the list of possible benefits. Did they arise because you produced a stripped down version of Postgres? Or did they arise from using threads?

It would not be the first time a result shown in prototype did not show real gains on a completed project.

I might also read your results to show that connection concentrators would be a better area of work, since 100 connections perform better than 1000 in both cases, so why bother optimising for 1000 connections at all? In which case we should read the benefit at the 100 connections line, where it shows the lower 28% gain, closer to the gain your colleague reported.

So I think we don't yet have enough to make a decision.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Postgres with pthread
On 7 December 2017 at 11:44, Tsunakawa, Takayuki <tsunakawa.ta...@jp.fujitsu.com> wrote:
> From: Craig Ringer [mailto:cr...@2ndquadrant.com]
>> I'd personally expect that an immediate conversion would result in very little speedup, a bunch of code deleted, a bunch of complexity added. And it'd still be massively worthwhile, to keep medium to long term complexity and feature viability in control.
>
> +1
> I hope for things like:
>
> * More performance statistics like system-wide LWLock waits, without the concern about fixed shared memory size
> * Dynamic memory sizing, such as shared_buffers, work_mem, maintenance_work_mem

I'm not sure how threaded operations would help us much there. If we could split shared_buffers into extents we could do this with something like dsm already. Without the ability to split it into extents, we can't do it with locally malloc'd memory in a threaded system either.

Re performance diagnostics though, you can already get a lot of useful data from PostgreSQL's SDT tracepoints, which are usable with perf and DTrace amongst other tools. Dynamic userspace 'perf' probes can tell you a lot too. I'm confident you could collect some seriously useful data with perf tracepoints and 'perf script' these days. (BTW, I extended the https://wiki.postgresql.org/wiki/Profiling_with_perf article a bit yesterday with some tips on this.)

Of course better built-in diagnostics would be nice. But I really don't see how it'd have much to do with the threaded vs forked model of execution; we can allocate chunks of memory with dsm now, after all.

> * Running multi-threaded components in a postgres extension (is it really safe to run a JVM for PL/Java in a single-threaded postgres?)

PL/Java is a giant mess for so many more reasons than that. The JVM is a heavyweight-startup, lightweight-thread-model system. It doesn't play at all well with postgres's lightweight process fork()-based CoW model.
You can't fork() the JVM because fork() doesn't play nice with threads, at all. So you have to start it in each backend individually, which is just awful. One of the nice things if Pg got a threaded model would be that you could embed a JVM, Mono/.NET runtime, etc and have your sessions work together in ways you cannot currently sensibly do. Folks using MS SQL, Oracle, etc are pretty used to being able to do this, and while it should be done with caution it can offer huge benefits for some complex workloads. Right now if a PostgreSQL user wants to do anything involving IPC, shared data, etc, we pretty much have to write quite complex C extensions to do it. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Postgres with pthread
On 7 December 2017 at 05:58, Thomas Munro wrote:
> Using a ton of thread local variables may be a useful stepping stone,
> but if we want to be able to separate threads/processes from sessions
> eventually then I guess we'll want to model sessions as first class
> objects and pass them around explicitly or using a single TLS variable
> current_session.

Yep. This is the real reason I'm excited by the idea of a threading conversion. PostgreSQL's architecture conflates "connection", "session" and "executor" into one somewhat muddled mess. I'd love to be able to untangle that to the point where we can pool executors amongst active queries, while retaining idle sessions' state properly even while they're in a transaction. Yeah, that's a long way off, but it'd be a whole lot more practical if we didn't have to serialize and deserialize the entire session state to do it.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Postgres with pthread
On 7 December 2017 at 01:17, Andres Freund wrote:
> > I think you've done us a very substantial service by pursuing
> > this far enough to get some quantifiable performance results.
> > But now that we have some results in hand, I think we're best
> > off sticking with the architecture we've got.
>
> I don't agree.
>
> I'd personally expect that an immediate conversion would result in very
> little speedup, a bunch of code deleted, a bunch of complexity
> added. And it'd still be massively worthwhile, to keep medium to long
> term complexity and feature viability in control.

Personally I think it's a pity we didn't land up here before the foundations for parallel query went in - DSM, shm_mq, DSA, etc. I know the EDB folks at least looked into it though, and presumably there were good reasons to go in this direction. Maybe that was just "community will never accept threaded conversion" at the time, though. Now we have quite a lot of homebrew infrastructure to consider if we do a conversion. That said, it might in some ways make it easier. shm_mq, for example, would likely convert to a threaded backend with minimal changes to callers, and probably only limited changes to shm_mq itself. So maybe these abstractions will prove to have been a win in some ways. Except DSA, and even then it could serve as a transitional API...

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Postgres with pthread
On Thu, Dec 7, 2017 at 6:08 AM, Andres Freund wrote:
> On 2017-12-06 19:40:00 +0300, Konstantin Knizhnik wrote:
>> As far as I remember, several years ago when implementation of intra-query
>> parallelism was just started there was discussion whether to use threads or
>> leave traditional Postgres process architecture. The decision was made to
>> leave processes. So now we have bgworkers, shared message queue, DSM, ...
>> The main argument for such decision was that switching to threads will
>> require rewriting of most of Postgres code.
>
>> It seems to be a quite reasonable argument, and until now I agreed with it.
>>
>> But recently I wanted to check it myself.
>
> I think that's something pretty important to play with. There've been
> several discussions lately, both on and off list / in person, that we're
> taking on more-and-more technical debt just because we're using
> processes. Besides the above, we've grown:
> - a shared memory allocator
> - a shared memory hashtable
> - weird looking thread aware pointers
> - significant added complexity in various projects due to addresses not
>   being mapped to the same address etc.

Yes, those are all workarounds for an ancient temporary design choice. To quote from a 1989 paper[1] "Currently, POSTGRES runs as one process for each active user. This was done as an expedient to get a system operational as quickly as possible. We plan on converting POSTGRES to use lightweight processes [...]". +1 for sticking to the plan. While personally contributing to the technical debt items listed above, I always imagined that all that machinery could become compile-time options controlled with --with-threads, and dsa_get_address() would melt away leaving only raw pointers, and dsa_area would forward to the MemoryContext + ResourceOwner APIs, or something like that. It's unfortunate that we lose type safety along the way though. (If only there were some way we could write dsa_pointer.
In fact it was also a goal of the original project to adopt C++, based on a comment in 4.2's nodes.h: "Eventually this code should be transmogrified into C++ classes, and this is more or less compatible with those things.") If there were a good way to reserve (but not map) a large address range before forking, there could also be an intermediate build mode that keeps the multi-process model but where DSA behaves as above, which might be an interesting way to decouple the DSA-go-faster-and-reduce-tech-debt project from the threading project. We could manage the reserved address space ourselves and map DSM segments with MAP_FIXED, so dsa_get_address() address decoding could be compiled away. One way would be to mmap a huge range backed with /dev/zero, and then map-with-MAP_FIXED segments over the top of it and then remap /dev/zero back into place when finished, but that sucks because it gives you that whole mapping in your core files and relies on overcommit, which we don't like, hence my interest in a way to reserve but not map.

>> The first problem with porting Postgres to pthreads is static variables
>> widely used in Postgres code.
>> Most modern compilers support thread-local variables, for example GCC
>> provides the __thread keyword.
>> Such variables are placed in a separate segment which is addressed through
>> a segment register (on Intel).
>> So access time to such variables is the same as to normal static variables.
>
> I experimented similarly. Although I'm not 100% sure that if we were to go
> for it, we wouldn't instead want to abstract our session concept
> further, or well, at all.

Using a ton of thread local variables may be a useful stepping stone, but if we want to be able to separate threads/processes from sessions eventually then I guess we'll want to model sessions as first class objects and pass them around explicitly or using a single TLS variable current_session.
> I think the biggest problem with doing this for real is that it's a huge > project, and that it'll take a long time. > > Thanks for working on this! +1 [1] http://db.cs.berkeley.edu/papers/ERL-M90-34.pdf -- Thomas Munro http://www.enterprisedb.com
Re: Postgres with pthread
Hi, On 2017-12-06 12:28:29 -0500, Robert Haas wrote: > > Possibly we even want to continue having various > > processes around besides that, the most interesting cases involving > > threads are around intra-query parallelism, and pooling, and for both a > > hybrid model could be beneficial. > > I think if we only use threads for intra-query parallelism we're > leaving a lot of money on the table. For example, if all > shmem-connected backends are using the same process, then we can make > max_locks_per_transaction PGC_SIGHUP. That would be sweet, and there > are probably plenty of similar things. Moreover, if threads are this > thing that we only use now and then for parallel query, then our > support for them will probably have bugs. If we use them all the > time, we'll actually find the bugs and fix them. I hope. I think it'd make a lot of sense to go there gradually. I agree that we probably want to move to more and more use of threads, but we also want our users not to kill us ;). Initially we'd surely continue to use partitioned dynahash for locks, which'd make resizing infeasible anyway. Similar for shared buffers (which I find a hell of a lot more interesting to change at runtime than max_locks_per_transaction), etc... - Andres
Re: Postgres with pthread
On 12/06/2017 06:08 PM, Andres Freund wrote:
> I think the biggest problem with doing this for real is that it's a huge
> project, and that it'll take a long time.

An additional issue is that this could break a lot of extensions, and in a way that is not apparent at compile time. This means we may need to break all extensions to force extension authors to check if they are thread safe. I do not like making life hard for our extension community, but if the gains are big enough it might be worth it.

> Thanks for working on this!

Seconded.

Andreas
Re: Postgres with pthread
> "barely a 50% speedup" - Hah. I don't believe the numbers, but that'd be
> huge.

They are numbers derived from a benchmark that any sane person would be using a connection pool for in a production environment, but impressive if true nonetheless.
Re: Postgres with pthread
Hi,

On 2017-12-06 11:53:21 -0500, Tom Lane wrote:
> Konstantin Knizhnik writes:
> However, if I guess at which numbers are supposed to be what,
> it looks like even the best case is barely a 50% speedup.

"barely a 50% speedup" - Hah. I don't believe the numbers, but that'd be huge.

> That would be worth pursuing if it were reasonably low-hanging
> fruit, but converting PG to threads seems very far from being that.

I don't think immediate performance gains are the interesting part about using threads. It's rather that their absence adds a lot of complexity to existing / submitted code, and makes some very commonly requested features a lot harder to implement:
- we've a lot of duplicated infrastructure around dynamic shared memory. dsm.c, dsa.c, dshash.c etc. A lot of these, especially dsa.c, are going to become a lot more complicated over time, just look at how complicated good multi-threaded allocators are.
- we're adding a lot of slowness to parallelism, just because we have different memory layouts in different processes. Instead of just passing pointers through queues, we put entire tuples in there. We deal with dsm-aware pointers.
- a lot of features have been a lot harder (parallelism!), and a lot of frequently requested ones are so hard due to processes that they never got off the ground (in-core pooling, process reuse, parallel worker reuse)
- due to the statically sized shared memory a lot of our configuration is pretty fundamentally PGC_POSTMASTER, even though that presents a lot of administrative problems.
...

> I think you've done us a very substantial service by pursuing
> this far enough to get some quantifiable performance results.
> But now that we have some results in hand, I think we're best
> off sticking with the architecture we've got.

I don't agree. I'd personally expect that an immediate conversion would result in very little speedup, a bunch of code deleted, a bunch of complexity added.
And it'd still be massively worthwhile, to keep medium to long term complexity and feature viability in control. Greetings, Andres Freund
Re: Postgres with pthread
On Wed, Dec 6, 2017 at 11:53 AM, Tom Lane wrote:
> barely a 50% speedup.

I think that's an awfully strange choice of adverb. This is, by its author's own admission, a rough cut at this, probably with very little of the optimization that could ultimately be done, and it's already buying 50% on some test cases? That sounds phenomenally good to me. A 50% speedup is huge, and chances are that it can be made quite a bit better with more work, or that it already is quite a bit better with the right test case. TBH, based on previous discussion, I expected this to initially be *slower* but still worthwhile in the long run because of optimizations that it would let us do eventually with parallel query and other things. If it's this much faster out of the gate, that's really exciting.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Postgres with pthread
Hi!

On 2017-12-06 19:40:00 +0300, Konstantin Knizhnik wrote:
> As far as I remember, several years ago when implementation of intra-query
> parallelism was just started there was discussion whether to use threads or
> leave traditional Postgres process architecture. The decision was made to
> leave processes. So now we have bgworkers, shared message queue, DSM, ...
> The main argument for such decision was that switching to threads will
> require rewriting of most of Postgres code.
> It seems to be a quite reasonable argument, and until now I agreed with it.
>
> But recently I wanted to check it myself.

I think that's something pretty important to play with. There've been several discussions lately, both on and off list / in person, that we're taking on more-and-more technical debt just because we're using processes. Besides the above, we've grown:
- a shared memory allocator
- a shared memory hashtable
- weird looking thread aware pointers
- significant added complexity in various projects due to addresses not being mapped to the same address etc.

> The first problem with porting Postgres to pthreads is static variables
> widely used in Postgres code.
> Most modern compilers support thread-local variables, for example GCC
> provides the __thread keyword.
> Such variables are placed in a separate segment which is addressed through
> a segment register (on Intel).
> So access time to such variables is the same as to normal static variables.

I experimented similarly. Although I'm not 100% sure that if we were to go for it, we wouldn't instead want to abstract our session concept further, or well, at all.

> Certainly maybe not all compilers have builtin support for TLS, and maybe
> on some hardware platforms it is not implemented as efficiently as on Intel.
> So certainly such an approach decreases the portability of Postgres. But IMHO
> it is not so critical.

I'd agree there, but I don't think the project necessarily does.

> What I have done:
> 1.
Add session_local (defined as __thread) to the definition of most static
> and global variables.
> I left some variables pointing to shared memory as static. Also I had to
> change the initialization of some static variables,
> because the address of a TLS variable can not be used in static initializers.
> 2. Change the implementation of GUCs to make them thread specific.
> 3. Replace fork() with pthread_create.
> 4. Rewrite the file descriptor cache to be global (shared by all threads).

That one I'm very unconvinced of, that's going to add a ton of new contention.

> What are the advantages of using threads instead of processes?
>
> 1. No need to use shared memory. So there is no static limit for the amount
> of memory which can be used by Postgres. No need for distributed shared
> memory and other stuff designed to share memory between backends and
> bgworkers.

This imo is the biggest part. We can stop duplicating OS and our own implementations in a shmem aware way.

> 2. Threads significantly simplify implementation of parallel algorithms:
> interaction and transferring data between threads can be done easily and
> more efficiently.

That's imo the same as 1.

> 3. It is possible to use more efficient/lightweight synchronization
> primitives. Postgres now mostly relies on its own low level sync primitives,
> whose user-level implementation uses spinlocks and atomics and then falls
> back to OS semaphores/poll. I am not sure how much gain we can get by
> replacing these primitives with ones optimized for threads.
> My colleague from the Firebird community told me that just replacing
> processes with threads can yield a 20% increase in performance, but that is
> just a first step, and replacing sync primitives can give a much greater
> advantage. But maybe for Postgres with its low level primitives it is not
> true.

I don't believe that that's actually the case to any significant degree.

> 6. Faster backend startup. Certainly starting a backend at each user's
> request is a bad thing in any case.
Some kind of connection pooling should be used in
> any case to provide acceptable performance. But in any case, starting a new
> backend process in postgres causes a lot of page faults which have a
> dramatic impact on performance. And there is no such problem with threads.

I don't buy this in itself. The connection establishment overhead isn't largely the fork, it's all the work afterwards. I do think it makes connection pooling etc easier.

> I just want to receive some feedback and know if the community is interested
> in any further work in this direction.

I personally am. I think it's beyond high time that we move to take advantage of threads. That said, I don't think just replacing processes with threads is the right thing. I'm pretty sure we'd still want to have postmaster as a separate process, for robustness. Possibly we even want to continue having various processes around besides that, the most interesting cases involving threads are around intra-query parallelism, and pooling, and for both a hybrid
Re: Postgres with pthread
Here it is formatted a little better. So a little over 50% performance improvement for a couple of the test cases.

On Wed, Dec 6, 2017 at 11:53 AM, Tom Lane wrote:
> Konstantin Knizhnik writes:
> > Below are some results (1000xTPS) of select-only (-S) pgbench with scale
> > 100 at my desktop with quad-core i7-4770 3.40GHz and 16Gb of RAM:
> >
> > Connections   Vanilla/default   Vanilla/prepared   pthreads/default   pthreads/prepared
> > 10            100               191                106                207
> > 100            67               131                105                168
> > 1000           41                65                 55                102
>
> This table is so mangled that I'm not very sure what it's saying.
> Maybe you should have made it an attachment?
>
> However, if I guess at which numbers are supposed to be what,
> it looks like even the best case is barely a 50% speedup.
> That would be worth pursuing if it were reasonably low-hanging
> fruit, but converting PG to threads seems very far from being that.
>
> I think you've done us a very substantial service by pursuing
> this far enough to get some quantifiable performance results.
> But now that we have some results in hand, I think we're best
> off sticking with the architecture we've got.
>
> regards, tom lane