Re: Postgres with pthread

2017-12-27 Thread Konstantin Knizhnik



On 27.12.2017 13:08, Andres Freund wrote:


On December 27, 2017 11:05:52 AM GMT+01:00, james 
 wrote:

All threads are blocked in semaphores.

That they are blocked is inevitable - I guess the issue is that they
are
thrashing.
I guess it would be necessary to separate the internals to have some
internal queueing and effectively reduce the number of actively
executing threads.
In effect make the connection pooling work internally.

Would it be possible to make the caches have persistent (functional)
data structures - effectively CoW?

And how easy would it be to abort if the master view had subsequently
changed when it comes to execution?

Optimizing for this seems like a pointless exercise. If the goal is efficient 
processing of 100k connections, the solution is a session / connection 
abstraction and a scheduler. Optimizing for this amount of concurrency will 
just add complexity and slowdowns for a workload that nobody will run.
I agree with you that supporting 100k active connections makes little 
practical sense right now.
But there are many systems with hundreds of cores, and to utilize them we 
still need to spawn thousands of backends.

In this case Postgres snapshots and local caches become inefficient.
Switching to CSN more or less solves the problem with snapshots.
But the problem with private caches should also be addressed: it seems 
very wasteful to perform the same work 1000 times and maintain 
1000 copies.
Also, in the case of global prepared statements, a global cache 
allows spending more time on plan optimization or applying manual tuning.


Switching to the pthreads model significantly simplifies development of 
shared caches: there are no problems with a statically allocated shared 
address space or with dynamic segments mapped at different addresses, which 
prevent the use of normal pointers. Invalidation of a shared cache is also 
easier: there is no need to send invalidation notifications to all backends.
But it still requires a lot of work. For example, the catalog cache is 
tightly integrated with resource owner information.
A shared cache also requires synchronization, and this synchronization 
itself can become a bottleneck.



Andres


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Postgres with pthread

2017-12-27 Thread Andres Freund


On December 27, 2017 11:05:52 AM GMT+01:00, james 
 wrote:
> > All threads are blocked in semaphores.
>That they are blocked is inevitable - I guess the issue is that they
>are 
>thrashing.
>I guess it would be necessary to separate the internals to have some 
>internal queueing and effectively reduce the number of actively 
>executing threads.
>In effect make the connection pooling work internally.
>
>Would it be possible to make the caches have persistent (functional) 
>data structures - effectively CoW?
>
>And how easy would it be to abort if the master view had subsequently 
>changed when it comes to execution?

Optimizing for this seems like a pointless exercise. If the goal is efficient 
processing of 100k connections, the solution is a session / connection 
abstraction and a scheduler. Optimizing for this amount of concurrency will 
just add complexity and slowdowns for a workload that nobody will run.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Postgres with pthread

2017-12-27 Thread james

> All threads are blocked in semaphores.
That they are blocked is inevitable - I guess the issue is that they are 
thrashing.
I guess it would be necessary to separate the internals to have some 
internal queueing and effectively reduce the number of actively 
executing threads.

In effect make the connection pooling work internally.

Would it be possible to make the caches have persistent (functional) 
data structures - effectively CoW?


And how easy would it be to abort if the master view had subsequently 
changed when it comes to execution?





Re: Postgres with pthread

2017-12-27 Thread Konstantin Knizhnik



On 21.12.2017 16:25, Konstantin Knizhnik wrote:

I continue experiments with my pthread prototype.
Latest results are the following:

1. I have eliminated all (I hope) calls of non-reentrant functions 
(getopt, setlocale, setitimer, localtime, ...). So now parallel tests 
pass.


2. I have implemented deallocation of top memory context (at thread 
exit) and cleanup of all opened file descriptors.
I had to replace several places where malloc is used with top_malloc: 
allocation in the top context.


3. Now my prototype is passing all regression tests. But error handling 
is still far from complete.


4. I have performed experiments with replacing synchronization 
primitives used in Postgres with pthread analogues.

Unfortunately it has almost no influence on performance.

5. Handling large number of connections.
The maximal number of postgres connections is almost the same: 100k.
But the memory footprint in the case of pthreads was significantly smaller: 
18Gb vs 38Gb.

And the difference in performance was much higher: 60k TPS vs. 600k TPS.
Compare it with the performance for 10k clients: 1300k TPS.
This is the read-only pgbench -S test with 1000 connections.
Since pgbench doesn't allow specifying more than 1000 clients, I 
spawned several instances of pgbench.


Why is handling a large number of connections important?
It allows applications to access postgres directly, without 
pgbouncer or any other external connection pooling tool.
In this case an application can use prepared statements, which can 
nearly double the speed of simple queries.


Unfortunately Postgres sessions are not lightweight. Each backend 
maintains its private catalog and relation caches, prepared statement 
cache,...
For a real database these caches can occupy several megabytes of memory 
per backend, and warming them can take a significant amount of time.
So if we really want to support a large number of connections, we should 
rewrite the caches to be global (shared).
This will save a lot of memory but add synchronization overhead. 
Also, on NUMA systems private caches may be more efficient than one 
global cache.


My prototype can be found at: 
git://github.com/postgrespro/postgresql.pthreads.git




Finally I managed to run Postgres with 100k active connections.
Not sure this result qualifies for the Guinness records, but I am almost 
sure that nobody has done it before (at least with the original version 
of Postgres).
But it was really a "Pyrrhic victory". Performance for 100k connections is 
1000 times lower than for 10k. All threads are blocked on semaphores.
This is a more or less expected result, but the scale of degradation is 
still impressive:



#Connections   TPS
100k           550
10k            558k
6k             745k
4k             882k
2k             1100k
1k             1300k



As is clear from these stack traces, a shared catalog cache and statement 
cache are badly needed to provide good performance with such a large number 
of active backends:



(gdb) thread apply all bt

Thread 17807 (LWP 660863):
#0  0x7f4c1cb46576 in do_futex_wait.constprop () from 
/lib64/libpthread.so.0
#1  0x7f4c1cb46668 in __new_sem_wait_slow.constprop.0 () from 
/lib64/libpthread.so.0

#2  0x00697a32 in PGSemaphoreLock ()
#3  0x00702a64 in LWLockAcquire ()
#4  0x006fbf2d in LockAcquireExtended ()
#5  0x006f9fa3 in LockRelationOid ()
#6  0x004b2ffd in relation_open ()
#7  0x004b31d6 in heap_open ()
#8  0x007f1ed1 in CatalogCacheInitializeCache ()
#9  0x007f3835 in SearchCatCache1 ()
#10 0x00800510 in get_tablespace ()
#11 0x008006e1 in get_tablespace_page_costs ()
#12 0x0065a4e1 in cost_seqscan ()
#13 0x0068bf92 in create_seqscan_path ()
#14 0x006568b4 in set_rel_pathlist ()
#15 0x00656eb8 in make_one_rel ()
#16 0x006740d0 in query_planner ()
#17 0x00676526 in grouping_planner ()
#18 0x00679812 in subquery_planner ()
#19 0x0067a66c in standard_planner ()
#20 0x0070ffe1 in pg_plan_query ()
#21 0x007100b6 in pg_plan_queries ()
#22 0x007f6c6f in BuildCachedPlan ()
#23 0x007f6e5c in GetCachedPlan ()
#24 0x00711ccf in PostgresMain ()
#25 0x006a5535 in backend_main_proc ()
#26 0x006a353d in thread_trampoline ()
#27 0x7f4c1cb3d36d in start_thread () from /lib64/libpthread.so.0
#28 0x7f4c1c153b8f in clone () from /lib64/libc.so.6

Thread 17806 (LWP 660861):
#0  0x7f4c1cb46576 in do_futex_wait.constprop () from 
/lib64/libpthread.so.0
#1  0x7f4c1cb46668 in __new_sem_wait_slow.constprop.0 () from 
/lib64/libpthread.so.0

#2  0x00697a32 in PGSemaphoreLock ()
#3  0x00702a64 in LWLockAcquire ()
#4  0x006fbf2d in LockAcquireExtended ()
#5  0x006f9fa3 in LockRelationOid ()
#6  0x004b2ffd in relation_open ()
#7  0x004b31d6 in heap_open ()
#8  0x007f1ed1 in CatalogCacheInitializeCache ()
#9  

Re: Postgres with pthread

2017-12-21 Thread Pavel Stehule
2017-12-21 14:25 GMT+01:00 Konstantin Knizhnik :

> I continue experiments with my pthread prototype.
> Latest results are the following:
>
> 1. I have eliminated all (I hope) calls of non-reentrant functions
> (getopt, setlocale, setitimer, localtime, ...). So now parallel tests
> pass.
>
> 2. I have implemented deallocation of top memory context (at thread exit)
> and cleanup of all opened file descriptors.
> I had to replace several places where malloc is used with top_malloc:
> allocation in the top context.
>
> 3. Now my prototype is passing all regression tests. But error handling
> is still far from complete.
>
> 4. I have performed experiments with replacing synchronization primitives
> used in Postgres with pthread analogues.
> Unfortunately it has almost no influence on performance.
>
> 5. Handling large number of connections.
> The maximal number of postgres connections is almost the same: 100k.
> But memory footprint in case of pthreads was significantly smaller: 18Gb
> vs 38Gb.
> And difference in performance was much higher: 60k TPS vs . 600k TPS.
> Compare it with performance for 10k clients: 1300k TPS.
> It is read-only pgbench -S test with 1000 connections.
> As far as pgbench doesn't allow to specify more than 1000 clients, I
> spawned several instances of pgbench.
>
> Why handling large number of connections is important?
> It allows applications to access postgres directly, not using pgbouncer or
> any other external connection pooling tool.
> In this case an application can use prepared statements, which can nearly
> double the speed of simple queries.
>

From what I know, MySQL does not have good experience with a high number of
threads - and there is a thread pool in the enterprise (and now in MariaDB)
versions.

Regards

Pavel


> Unfortunately Postgres sessions are not lightweight. Each backend
> maintains its private catalog and relation caches, prepared statement
> cache,...
> For real database size of this caches in memory will be several megabytes
> and warming this caches can take significant amount of time.
> So if we really want to support large number of connections, we should
> rewrite caches to be global (shared).
> It will allow to save a lot of memory but add synchronization overhead.
> Also at NUMA private caches may be more efficient than one global cache.
>
> My prototype can be found at: git://github.com/postgrespro/postgresql.pthreads.git
>
>
> --
>
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>


Re: Postgres with pthread

2017-12-21 Thread Konstantin Knizhnik

I continue experiments with my pthread prototype.
Latest results are the following:

1. I have eliminated all (I hope) calls of non-reentrant functions 
(getopt, setlocale, setitimer, localtime, ...). So now parallel tests 
pass.


2. I have implemented deallocation of the top memory context (at thread 
exit) and cleanup of all opened file descriptors.
I had to replace several places where malloc is used with top_malloc: 
allocation in the top context.


3. Now my prototype is passing all regression tests. But error handling 
is still far from complete.


4. I have performed experiments with replacing synchronization 
primitives used in Postgres with pthread analogues.

Unfortunately it has almost no influence on performance.

5. Handling large number of connections.
The maximal number of postgres connections is almost the same: 100k.
But the memory footprint in the case of pthreads was significantly smaller: 
18Gb vs 38Gb.

And the difference in performance was much higher: 60k TPS vs. 600k TPS.
Compare it with the performance for 10k clients: 1300k TPS.
This is the read-only pgbench -S test with 1000 connections.
Since pgbench doesn't allow specifying more than 1000 clients, I 
spawned several instances of pgbench.


Why is handling a large number of connections important?
It allows applications to access postgres directly, without 
pgbouncer or any other external connection pooling tool.
In this case an application can use prepared statements, which can 
nearly double the speed of simple queries.


Unfortunately Postgres sessions are not lightweight. Each backend 
maintains its private catalog and relation caches, prepared statement 
cache,...
For a real database these caches can occupy several megabytes of memory 
per backend, and warming them can take a significant amount of time.
So if we really want to support a large number of connections, we should 
rewrite the caches to be global (shared).
This will save a lot of memory but add synchronization overhead. 
Also, on NUMA systems private caches may be more efficient than one 
global cache.


My prototype can be found at: 
git://github.com/postgrespro/postgresql.pthreads.git



--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Postgres with pthread

2017-12-10 Thread james

On 06/12/2017 17:26, Andreas Karlsson wrote:
An additional issue is that this could break a lot of extensions and 
in a way that it is not apparent at compile time. This means we may 
need to break all extensions to force extensions authors to check if 
they are thread safe.


I do not like making life hard for our extension community, but if the 
gains are big enough it might be worth it.


It seems to me that the counter-argument is that extensions that 
naturally support threading will benefit.  For example it may be a lot 
more practical to have CLR or JVM extensions.






Re: Postgres with pthread

2017-12-08 Thread Alexander Korotkov
On Sat, Dec 9, 2017 at 1:09 AM, konstantin knizhnik <
k.knizh...@postgrespro.ru> wrote:

> I am not going to show stack traces of all 1000 threads.
> But you may notice that proc array lock really seems to be a bottleneck.
>

Yes, the proc array lock easily becomes a bottleneck on a multicore machine
with a large number of connections.  Related to this, another patch helping
with a large number of connections is CSN.  When our snapshot model was
invented, xip was just an array of a few elements, and that caused no
problems.  Now, we're considering threads to help us handle thousands of
connections.  A snapshot with thousands of xips looks ridiculous.  Collecting
such a large snapshot could be more expensive than a single index lookup.

These two patches, threads and CSN, are both complicated and require hard
work during multiple release cycles to get committed.  But I really hope
that their cumulative effect can dramatically improve the situation at a
high number of connections.  There are already some promising benchmarks in
the CSN thread.  I wonder if we can already do some cumulative benchmarks of
threads + CSN?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Postgres with pthread

2017-12-08 Thread konstantin knizhnik

On Dec 7, 2017, at 10:41 AM, Simon Riggs wrote:

>> But it is a theory. The main idea of this prototype was to prove or disprove
>> this expectation at practice.
> 
>> But please notice that it is very raw prototype. A lot of stuff is not
>> working yet.
> 
>> And supporting all of existing Postgres functionality requires
>> much more effort (and even more effort is needed to optimize Postgres
>> for this architecture).
>> 
>> I just want to receive some feedback and know if community is interested in
>> any further work in this direction.
> 
> Looks good. You are right, it is a theory. If your prototype does
> actually show what we think it does then it is a good and interesting
> result.
> 
> I think we need careful analysis to show where these exact gains come
> from. The actual benefit is likely not evenly distributed across the
> list of possible benefits. Did they arise because you produced a
> stripped down version of Postgres? Or did they arise from using
> threads?
> 
> It would not be the first time a result shown in prototype did not show
> real gains on a completed project.
> 
> I might also read your results to show that connection concentrators
> would be a better area of work, since 100 connections perform better
> than 1000 in both cases, so why bother optimising for 1000 connections
> at all? In which case we should read the benefit at the 100
> connections line, where it shows the lower 28% gain, closer to the
> gain your colleague reported.
> 
> So I think we don't yet have enough to make a decision.


Concerning the optimal number of connections: one of my intentions was to 
eliminate the need for an external connection pool (pgbouncer).
In this case applications can use prepared statements, which by themselves 
provide a twofold increase in performance.
I believe that threads have a smaller footprint than processes, so it is 
possible to spawn more threads and access them directly without an 
intermediate connection pooling layer.


I have performed experiments on a more powerful server: 
144 virtual cores Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz.

Here the results of read-only queries are different: both the pthreads and 
vanilla versions show almost the same speed for both 100 and 1000 connections: 
about 1300k TPS with prepared statements. So there is no performance 
degradation with an increased number of connections and no large difference 
between processes and threads.

But at a read-write workload (pgbench -N) there is still a significant 
advantage for the pthreads version (kTPS):


Connections   Vanilla   pthreads
100           165       154
1000          85        118


For some reason (which I do not know yet) the multiprocess version of 
postgres is slightly faster for 100 connections, but degrades almost twofold 
at 1000 connections, while the degradation of the multithreaded version is 
not as large.

By the way, the pthreads version makes it much easier to check what is going 
on using gdb (manual "profiling"):


thread apply all bt
Thread 997 (Thread 0x7f6e08810700 (LWP 61345)):
#0  0x7f7e03263576 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x7f7e03263668 in __new_sem_wait_slow.constprop.0 () from 
/lib64/libpthread.so.0
#2  0x00698552 in PGSemaphoreLock ()
#3  0x00702804 in LWLockAcquire ()
#4  0x004f9ac4 in XLogInsertRecord ()
#5  0x00503b97 in XLogInsert ()
#6  0x004bb0d1 in log_heap_clean ()
#7  0x004bd7c8 in heap_page_prune ()
#8  0x004bd9c1 in heap_page_prune_opt ()
#9  0x004c43d4 in index_fetch_heap ()
#10 0x004c4410 in index_getnext ()
#11 0x006037d2 in IndexNext ()
#12 0x005f3a80 in ExecScan ()
#13 0x00609eba in ExecModifyTable ()
#14 0x005ed6fa in standard_ExecutorRun ()
#15 0x00713622 in ProcessQuery ()
#16 0x00713885 in PortalRunMulti ()
#17 0x007143a5 in PortalRun ()
#18 0x00711cf1 in PostgresMain ()
#19 0x006a708b in backend_main_proc ()
#20 0x7f7e0325a36d in start_thread () from /lib64/libpthread.so.0
#21 0x7f7e02870b8f in clone () from /lib64/libc.so.6

Thread 996 (Thread 0x7f6e08891700 (LWP 61344)):
#0  0x7f7e03263576 in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x7f7e03263668 in __new_sem_wait_slow.constprop.0 () from 
/lib64/libpthread.so.0
#2  0x00698552 in PGSemaphoreLock ()
#3  0x00702804 in LWLockAcquire ()
#4  0x004bc862 in RelationGetBufferForTuple ()
#5  0x004b60db in heap_insert ()
#6  0x0060ad3b in ExecModifyTable ()
#7  0x005ed6fa in standard_ExecutorRun ()
#8  0x00713622 in ProcessQuery ()
#9  0x00713885 in PortalRunMulti ()
#10 0x007143a5 in PortalRun ()
#11 0x00711cf1 in PostgresMain ()
#12 0x006a708b in backend_main_proc ()
#13 0x7f7e0325a36d in start_thread () from /lib64/libpthread.so.0
#14 0x7f7e02870b8f in clone () from /lib64/libc.so.6

Thread 995 (Thread 0x7f6e08912700 (LWP 61343)):

Re: Postgres with pthread

2017-12-07 Thread Craig Ringer
On 8 December 2017 at 03:58, Andres Freund  wrote:

> On 2017-12-07 11:26:07 +0800, Craig Ringer wrote:
> > PostgreSQL's architecture conflates "connection", "session" and
> "executor"
> > into one somewhat muddled mess.
>
> How is the executor entangled in the other two?
>
>
Executor in the postgres sense isn't, so I chose the word poorly.

"Engine of execution" maybe. What I'm getting at is that we tie up more
resources than should ideally be necessary when a session is idle,
especially idle in transaction. But I guess a lot of that is really down to
memory allocated and not returned to the OS (because like other C programs
we can't do that), etc. The key resources like PGXACT entries aren't
something we can release while idle in a transaction after all.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres with pthread

2017-12-07 Thread Andres Freund
Hi,

On 2017-12-07 20:48:06 +, Greg Stark wrote:
> But then I thought about it a bit and I do wonder. I don't know how
> well we test having multiple portals doing all kinds of different
> query plans with their execution interleaved.

Cursors test that pretty well.


> And I definitely have doubts whether you can start SPI sessions from
> arbitrary points in the executor expression evaluation and don't know
> what state you can leave and resume them from on subsequent
> evaluations...

SPI being weird doesn't really have that much bearing on the executor
structure imo. But I'm unclear what you'd use SPI for that really
necessitates that. We don't suspend execution in the middle of function
execution...

Greetings,

Andres Freund



Re: Postgres with pthread

2017-12-07 Thread Andres Freund
On 2017-12-07 11:26:07 +0800, Craig Ringer wrote:
> PostgreSQL's architecture conflates "connection", "session" and "executor"
> into one somewhat muddled mess.

How is the executor entangled in the other two?

Greetings,

Andres Freund



Re: Postgres with pthread

2017-12-07 Thread Robert Haas
On Wed, Dec 6, 2017 at 10:20 PM, Craig Ringer  wrote:
> Personally I think it's a pity we didn't land up here before the foundations
> for parallel query went in - DSM, shm_mq, DSA, etc. I know the EDB folks at
> least looked into it though, and presumably there were good reasons to go in
> this direction. Maybe that was just "community will never accept threaded
> conversion" at the time, though.

Yep.  Never is a long time, but it took 3 release cycles to get a
user-visible feature as it was, and if I'd tried to insist on a
process->thread conversion first I suspect we'd still be stuck on that
point today.  Perhaps we would have gotten as far as getting that much
done, but that wouldn't make parallel query be done on top of it.

> Now we have quite a lot of homebrew infrastructure to consider if we do a
> conversion.
>
> That said, it might in some ways make it easier. shm_mq, for example, would
> likely convert to a threaded backend with minimal changes to callers, and
> probably only limited changes to shm_mq its self. So maybe these
> abstractions will prove to have been a win in some ways. Except DSA, and
> even then it could serve as a transitional API...

Yeah, I don't feel too bad about what we've built.  Even if it
ultimately goes away, it will have served the useful purpose of
proving that parallel query is a good idea and can work.  Besides,
shm_mq is just a ring buffer for messages; that's not automatically
something that we don't want just because we move to threads.  If it
goes away, which I think not unlikely, it'll be because something else
is faster.

Also, it's not as if only parallel query structures might have been
designed differently if we had been using threads all along.
dynahash, for example, is quite unlike most concurrent hash tables and
a big part of the reason is that it has to cope with being situated in
a fixed-size chunk of shared memory.  More generally, the whole reason
there's no cheap, straightforward palloc_shared() is the result of the
current design, and it seems very unlikely we wouldn't have that quite
apart from parallel query.  Install pg_stat_statements without a
server restart?  Yes, please.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Postgres with pthread

2017-12-07 Thread Craig Ringer
On 7 December 2017 at 19:55, Konstantin Knizhnik 
wrote:

>
> Pros:
> 1. Simplified memory model: no need in DSM, shm_mq, DSA, etc
>

shm_mq would remain useful, and the others could only be dropped if you
also dropped process-model support entirely.


> 1. Breaks compatibility with existed extensions and adds more requirements
> for authors of new extension
>

Depends on how much frightening preprocessor magic you're willing to use,
doesn't it? ;)

Wouldn't be surprised if simple extensions (C functions etc) stayed fairly
happy, but it'd be hazardous enough in terms of library use etc that
deliberate breakage may be better.


> 2. Problems with integration of single-threaded PLs: Python, Lua,...
>

Yeah, that's going to hurt. Especially since most non-plpgsql code out
there will be plperl and plpython. Breaking that's not going to be an
option, but nobody's going to be happy if all postgres backends must
contend for the same Python GIL. Plus it'd be deadlock-city.

That's nearly a showstopper right there. Especially since with a quick look
around it looks like the cPython GIL is per-DLL (at least on Windows) not
per-interpreter-state, so spawning separate interpreter states per-thread
may not be sufficient. That makes sense given that cPython its self is
thread-aware; otherwise it'd have a really hard time figuring out which GIL
and interpreter state to look at when in a cPython-spawned thread.


> 3. Worse protection from programming errors, including errors in
> extensions.
>

Mainly contaminating the memory of unrelated processes, or the postmaster.

I'm not worried about outright crashes. On any modern system it's not
significantly worse to take down the postmaster than it is to have it do
its own recovery. A modern init will restart it promptly. (If you're not
running postgres under an init daemon for production then... well, you
should be.)


> 4. Lack of explicit separation of shared and private memory leads to more
> synchronization errors.
>

Accidentally clobbering postmaster memory/state would be my main worry
there.

Right now we gain a lot of protection from our copy-on-write
shared-nothing-by-default model, and we rely on it in quite a lot of places
where backends merrily stomp on inherited postmaster state.

The more I think about it, the less enthusiastic I am, really.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres with pthread

2017-12-07 Thread Konstantin Knizhnik



On 07.12.2017 00:58, Thomas Munro wrote:

Using a ton of thread local variables may be a useful stepping stone,
but if we want to be able to separate threads/processes from sessions
eventually then I guess we'll want to model sessions as first class
objects and pass them around explicitly or using a single TLS variable
current_session.


It was my primary intention.
Unfortunately separating all static variables into some kind of session 
context requires much more effort:

we have to change all accesses to such variables.

But please notice that, from a performance point of view, access to 
__thread variables is no more expensive than access to a static variable or
to fields of a session context structure through current_session.  
And there is no extra space overhead for them.


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Postgres with pthread

2017-12-07 Thread Konstantin Knizhnik

Hi

On 06.12.2017 20:08, Andres Freund wrote:


4. Rewrite file descriptor cache to be global (shared by all threads).
That one I'm very unconvinced of, that's going to add a ton of new
contention.


Do you mean lock contention because of the mutex I used to synchronize 
access to the shared file descriptor cache,

or contention for file descriptors?
Right now each thread has its own virtual file descriptors, so they are 
not shared between threads.
But there is a common LRU list, restricting the total number of open 
descriptors in the process.


Actually I have no other choice if I want to support thousands of 
connections.
If each thread has its own private descriptor cache (as it is now for 
processes) and its size is estimated based on the open file quota,

then there will be millions of open file descriptors.

Concerning contention for the mutex, I do not think that it is a problem.
At least I have to say that performance (with 100 connections) is 
significantly improved and shows almost the same speed as for 10 
connections

after I rewrote the file descriptor cache and made it global
(my original implementation just made all fd.c static variables 
thread-local, so each thread had its separate pool).


It is possible to go further and share file descriptors between threads, 
using pwrite/pread instead of seek+read/write.

But we still need a mutex to implement the LRU list and the free handler list.



Re: Postgres with pthread

2017-12-06 Thread Simon Riggs
> But it is a theory. The main idea of this prototype was to prove or disprove
> this expectation at practice.

> But please notice that it is very raw prototype. A lot of stuff is not
> working yet.

> And supporting all of existing Postgres functionality requires
> much more effort (and even more effort is needed to optimize Postgres
> for this architecture).
>
> I just want to receive some feedback and know if community is interested in
> any further work in this direction.

Looks good. You are right, it is a theory. If your prototype does
actually show what we think it does then it is a good and interesting
result.

I think we need careful analysis to show where these exact gains come
from. The actual benefit is likely not evenly distributed across the
list of possible benefits. Did they arise because you produced a
stripped down version of Postgres? Or did they arise from using
threads?

It would not be the first time a result shown in prototype did not show
real gains on a completed project.

I might also read your results to show that connection concentrators
would be a better area of work, since 100 connections perform better
than 1000 in both cases, so why bother optimising for 1000 connections
at all? In which case we should read the benefit at the 100
connections line, where it shows the lower 28% gain, closer to the
gain your colleague reported.

So I think we don't yet have enough to make a decision.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Postgres with pthread

2017-12-06 Thread Craig Ringer
On 7 December 2017 at 11:44, Tsunakawa, Takayuki <
tsunakawa.ta...@jp.fujitsu.com> wrote:

> From: Craig Ringer [mailto:cr...@2ndquadrant.com]
> >   I'd personally expect that an immediate conversion would result
> > in very
> >   little speedup, a bunch of code deleted, a bunch of complexity
> >   added. And it'd still be massively worthwhile, to keep medium to
> > long
> >   term complexity and feature viability in control.
>
> +1
> I hope for things like:
>


> * More performance statistics like system-wide LWLock waits, without the
> concern about fixed shared memory size
>
* Dynamic memory sizing, such as shared_buffers, work_mem,
> maintenance_work_mem
>

I'm not sure how threaded operations would help us much there. If we could
split shared_buffers into extents we could do this with something like dsm
already. Without the ability to split it into extents, we can't do it with
locally malloc'd memory in a threaded system either.

Re performance diagnostics though, you can already get a lot of useful data
from PostgreSQL's SDT tracepoints, which are usable with perf and DTrace
amongst other tools. Dynamic userspace 'perf' probes can tell you a lot too.

I'm confident you could collect some seriously useful data with perf
tracepoints and 'perf script' these days. (BTW, I extended the
https://wiki.postgresql.org/wiki/Profiling_with_perf article a bit
yesterday with some tips on this).

Of course better built-in diagnostics would be nice. But I really don't see
how it'd have much to do with threaded vs forked model of execution; we can
allocate chunks of memory with dsm now, after all.


> * Running multi-threaded components in postgres extension (is it really
> safe to run JVM for PL/Java in a single-threaded postgres?)
>

PL/Java is a giant mess for so many more reasons than that. The JVM is a
heavyweight-startup, lightweight-thread-model system. It doesn't play at
all well with postgres's lightweight process fork()-based CoW model. You
can't fork() the JVM because fork() doesn't play nice with threads, at all.
So you have to start it in each backend individually, which is just awful.

One of the nice things if Pg got a threaded model would be that you could
embed a JVM, Mono/.NET runtime, etc and have your sessions work together in
ways you cannot currently sensibly do. Folks using MS SQL, Oracle, etc are
pretty used to being able to do this, and while it should be done with
caution it can offer huge benefits for some complex workloads.

Right now if a PostgreSQL user wants to do anything involving IPC, shared
data, etc, we pretty much have to write quite complex C extensions to do it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres with pthread

2017-12-06 Thread Craig Ringer
On 7 December 2017 at 05:58, Thomas Munro 
wrote:

>
> Using a ton of thread local variables may be a useful stepping stone,
> but if we want to be able to separate threads/processes from sessions
> eventually then I guess we'll want to model sessions as first class
> objects and pass them around explicitly or using a single TLS variable
> current_session.
>
>
Yep.

This is the real reason I'm excited by the idea of a threading conversion.

PostgreSQL's architecture conflates "connection", "session" and "executor"
into one somewhat muddled mess. I'd love to be able to untangle that to the
point where we can pool executors amongst active queries, while retaining
idle sessions' state properly even while they're in a transaction.

Yeah, that's a long way off, but it'd be a whole lot more practical if we
didn't have to serialize and deserialize the entire session state to do it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres with pthread

2017-12-06 Thread Craig Ringer
On 7 December 2017 at 01:17, Andres Freund  wrote:


>
> > I think you've done us a very substantial service by pursuing
> > this far enough to get some quantifiable performance results.
> > But now that we have some results in hand, I think we're best
> > off sticking with the architecture we've got.
>
> I don't agree.
>
> I'd personally expect that an immediate conversion would result in very
> little speedup, a bunch of code deleted, a bunch of complexity
> added. And it'd still be massively worthwhile, to keep medium to long
> term complexity and feature viability in control.
>

Personally I think it's a pity we didn't land up here before the
foundations for parallel query went in - DSM, shm_mq, DSA, etc. I know the
EDB folks at least looked into it though, and presumably there were good
reasons to go in this direction. Maybe that was just "community will never
accept threaded conversion" at the time, though.

Now we have quite a lot of homebrew infrastructure to consider if we do a
conversion.

That said, it might in some ways make it easier. shm_mq, for example, would
likely convert to a threaded backend with minimal changes to callers, and
probably only limited changes to shm_mq itself. So maybe these
abstractions will prove to have been a win in some ways. Except DSA, and
even then it could serve as a transitional API...

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Postgres with pthread

2017-12-06 Thread Thomas Munro
On Thu, Dec 7, 2017 at 6:08 AM, Andres Freund  wrote:
> On 2017-12-06 19:40:00 +0300, Konstantin Knizhnik wrote:
>> As far as I remember, several years ago when implementation of intra-query
>> parallelism was just started there was discussion whether to use threads or
>> leave traditional Postgres process architecture. The decision was made to
>> leave processes. So now we have bgworkers, shared message queue, DSM, ...
>> The main argument for such a decision was that switching to threads would
>> require rewriting most of the Postgres code.
>
>> It seems to be quite a reasonable argument, and until now I agreed with it.
>>
>> But recently I wanted to check it myself.
>
> I think that's something pretty important to play with. There've been
> several discussions lately, both on and off list / in person, that we're
> taking on more-and-more technical debt just because we're using
> processes. Besides the above, we've grown:
> - a shared memory allocator
> - a shared memory hashtable
> - weird looking thread aware pointers
> - significant added complexity in various projects due to addresses not
>   being mapped to the same address etc.

Yes, those are all workarounds for an ancient temporary design choice.
To quote from a 1989 paper[1] "Currently, POSTGRES runs as one process
for each active user. This was done as an expedient to get a system
operational as quickly as possible. We plan on converting POSTGRES to
use lightweight processes [...]".  +1 for sticking to the plan.

While personally contributing to the technical debt items listed
above, I always imagined that all that machinery could become
compile-time options controlled with --with-threads and
dsa_get_address() would melt away leaving only raw pointers, and
dsa_area would forward to the MemoryContext + ResourceOwner APIs, or
something like that.  It's unfortunate that we lose type safety along
the way though.  (If only there were some way we could write
dsa_pointer.  In fact it was also a goal of the original
project to adopt C++, based on a comment in 4.2's nodes.h: "Eventually
this code should be transmogrified into C++ classes, and this is more
or less compatible with those things.")

If there were a good way to reserve (but not map) a large address
range before forking, there could also be an intermediate build mode
that keeps the multi-process model but where DSA behaves as above,
which might be an interesting way to decouple the
DSA-go-faster-and-reduce-tech-debt project from the threading project.
We could manage the reserved address space ourselves and map DSM
segments with MAP_FIXED, so dsa_get_address() address decoding could
be compiled away.  One way would be to mmap a huge range backed with
/dev/zero, and then map-with-MAP_FIXED segments over the top of it and
then remap /dev/zero back into place when finished, but that sucks
because it gives you that whole mapping in your core files and relies
on overcommit which we don't like, hence my interest in a way to
reserve but not map.

>> The first problem with porting Postgres to pthreads is the static variables
>> widely used in Postgres code.
>> Most modern compilers support thread-local variables; for example, GCC
>> provides the __thread keyword.
>> Such variables are placed in a separate segment which is addressed through a
>> segment register (on Intel).
>> So access time to such variables is the same as for normal static variables.
>
> I experimented similarly. Although I'm not 100% sure that, if we were to go
> for it, we wouldn't instead want to abstract our session concept
> further, or well, at all.

Using a ton of thread local variables may be a useful stepping stone,
but if we want to be able to separate threads/processes from sessions
eventually then I guess we'll want to model sessions as first class
objects and pass them around explicitly or using a single TLS variable
current_session.

> I think the biggest problem with doing this for real is that it's a huge
> project, and that it'll take a long time.
>
> Thanks for working on this!

+1

[1] http://db.cs.berkeley.edu/papers/ERL-M90-34.pdf

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Postgres with pthread

2017-12-06 Thread Andres Freund
Hi,

On 2017-12-06 12:28:29 -0500, Robert Haas wrote:
> > Possibly we even want to continue having various
> > processes around besides that, the most interesting cases involving
> > threads are around intra-query parallelism, and pooling, and for both a
> > hybrid model could be beneficial.
> 
> I think if we only use threads for intra-query parallelism we're
> leaving a lot of money on the table.  For example, if all
> shmem-connected backends are using the same process, then we can make
> max_locks_per_transaction PGC_SIGHUP.  That would be sweet, and there
> are probably plenty of similar things.  Moreover, if threads are this
> thing that we only use now and then for parallel query, then our
> support for them will probably have bugs.  If we use them all the
> time, we'll actually find the bugs and fix them.  I hope.

I think it'd make a lot of sense to go there gradually. I agree that we
probably want to move to more and more use of threads, but we also want
our users not to kill us ;). Initially we'd surely continue to use
partitioned dynahash for locks, which'd make resizing infeasible
anyway. Similar for shared buffers (which I find a hell of a lot more
interesting to change at runtime than max_locks_per_transaction), etc...

- Andres



Re: Postgres with pthread

2017-12-06 Thread Andreas Karlsson

On 12/06/2017 06:08 PM, Andres Freund wrote:

I think the biggest problem with doing this for real is that it's a huge
project, and that it'll take a long time.


An additional issue is that this could break a lot of extensions, and in
a way that is not apparent at compile time. This means we may need to
break all extensions to force extension authors to check whether they are
thread safe.


I do not like making life hard for our extension community, but if the
gains are big enough it might be worth it.



Thanks for working on this!


Seconded.

Andreas



Re: Postgres with pthread

2017-12-06 Thread Adam Brusselback
> "barely a 50% speedup" - Hah. I don't believe the numbers, but that'd be
> huge.
They are numbers derived from a benchmark that any sane person would
run with a connection pool in a production environment, but
impressive if true nonetheless.



Re: Postgres with pthread

2017-12-06 Thread Andres Freund
Hi,

On 2017-12-06 11:53:21 -0500, Tom Lane wrote:
> Konstantin Knizhnik  writes:
> However, if I guess at which numbers are supposed to be what,
> it looks like even the best case is barely a 50% speedup.

"barely a 50% speedup" - Hah. I don't believe the numbers, but that'd be
huge.


> That would be worth pursuing if it were reasonably low-hanging
> fruit, but converting PG to threads seems very far from being that.

I don't think immediate performance gains are the interesting part about
using threads. It's rather what their absence adds a lot in existing /
submitted code complexity, and makes some very commonly requested
features a lot harder to implement:

- we've a lot of duplicated infrastructure around dynamic shared
  memory. dsm.c dsa.c, dshash.c etc. A lot of these, especially dsa.c,
  are going to become a lot more complicated over time, just look at how
  complicated good multi threaded allocators are.

- we're adding a lot of slowness to parallelism, just because we have
  different memory layouts in different processes. Instead of just
  passing pointers through queues, we put entire tuples in there. We
  deal with dsm aware pointers.

- a lot of features have been a lot harder (parallelism!), and a lot of
  frequently requested ones are so hard due to processes that they never
  got off the ground (in-core pooling, process reuse, parallel worker reuse)

- due to the statically sized shared memory a lot of our configuration
  is pretty fundamentally PGC_POSTMASTER, even though that presents a lot
  of administrative problems.

...


> I think you've done us a very substantial service by pursuing
> this far enough to get some quantifiable performance results.
> But now that we have some results in hand, I think we're best
> off sticking with the architecture we've got.

I don't agree.

I'd personally expect that an immediate conversion would result in very
little speedup, a bunch of code deleted, a bunch of complexity
added. And it'd still be massively worthwhile, to keep medium to long
term complexity and feature viability in control.

Greetings,

Andres Freund



Re: Postgres with pthread

2017-12-06 Thread Robert Haas
On Wed, Dec 6, 2017 at 11:53 AM, Tom Lane  wrote:
> barely a 50% speedup.

I think that's an awfully strange choice of adverb.  This is, by its
author's own admission, a rough cut, probably with very little
of the optimization that could ultimately be done, and it's already
buying 50% on some test cases?  That sounds phenomenally good to me.
A 50% speedup is huge, and chances are that it can be made quite a bit
better with more work, or that it already is quite a bit better with
the right test case.

TBH, based on previous discussion, I expected this to initially be
*slower* but still worthwhile in the long run because of optimizations
that it would let us do eventually with parallel query and other
things.  If it's this much faster out of the gate, that's really
exciting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Postgres with pthread

2017-12-06 Thread Andres Freund
Hi!

On 2017-12-06 19:40:00 +0300, Konstantin Knizhnik wrote:
> As far as I remember, several years ago when implementation of intra-query
> parallelism was just started there was discussion whether to use threads or
> leave traditional Postgres process architecture. The decision was made to
> leave processes. So now we have bgworkers, shared message queue, DSM, ...
> The main argument for such a decision was that switching to threads would
> require rewriting most of the Postgres code.

> It seems to be quite a reasonable argument, and until now I agreed with it.
> 
> But recently I wanted to check it myself.

I think that's something pretty important to play with. There've been
several discussions lately, both on and off list / in person, that we're
taking on more-and-more technical debt just because we're using
processes. Besides the above, we've grown:
- a shared memory allocator
- a shared memory hashtable
- weird looking thread aware pointers
- significant added complexity in various projects due to addresses not
  being mapped to the same address etc.


> The first problem with porting Postgres to pthreads is the static variables
> widely used in Postgres code.
> Most modern compilers support thread-local variables; for example, GCC
> provides the __thread keyword.
> Such variables are placed in a separate segment which is addressed through a
> segment register (on Intel).
> So access time to such variables is the same as for normal static variables.

I experimented similarly. Although I'm not 100% sure that, if we were to go
for it, we wouldn't instead want to abstract our session concept
further, or well, at all.


> Certainly, maybe not all compilers have built-in support for TLS, and maybe
> not all hardware platforms implement it as efficiently as
> Intel.
> So such an approach certainly decreases the portability of Postgres. But IMHO it is
> not so critical.

I'd agree there, but I don't think the project necessarily does.


> What I have done:
> 1. Add session_local (defined as __thread) to the definition of most static
> and global variables.
> I left some variables that point to shared memory as static. I also had to
> change the initialization of some static variables,
> because the address of a TLS variable cannot be used in static initializers.
> 2. Change implementation of GUCs to make them thread specific.
> 3. Replace fork() with pthread_create
> 4. Rewrite file descriptor cache to be global (shared by all threads).

That one I'm very unconvinced of, that's going to add a ton of new
contention.


> What are the advantages of using threads instead of processes?
> 
> 1. No need to use shared memory. So there is no static limit on the amount of
> memory which can be used by Postgres. No need for distributed shared memory
> and other stuff designed to share memory between backends and
> bgworkers.

This imo is the biggest part. We can stop duplicating OS facilities and our own
implementations in a shmem-aware way.


> 2. Threads significantly simplify the implementation of parallel algorithms:
> interaction and data transfer between threads can be done more easily and
> efficiently.

That's imo the same as 1.


> 3. It is possible to use more efficient/lightweight synchronization
> primitives. Postgres now mostly relies on its own low-level sync primitives,
> whose user-level implementation
> uses spinlocks and atomics and then falls back to OS semaphores/poll. I am
> not sure how much gain we can get by replacing these primitives with ones
> optimized for threads.
> My colleague from the Firebird community told me that just replacing processes
> with threads can yield a 20% increase in performance, but that is just the first
> step, and replacing sync primitives
> can give a much greater advantage. But maybe for Postgres, with its low-level
> primitives, it is not true.

I don't believe that that's actually the case to any significant degree.


> 6. Faster backend startup. Certainly, starting a backend for each user request
> is a bad thing in any case. Some kind of connection pooling should be used
> to provide acceptable performance. But in any case, starting a new
> backend process in Postgres causes a lot of page faults, which have a
> dramatic impact on performance. And there is no such problem with threads.

I don't buy this in itself. The connection establishment overhead isn't
largely the fork, it's all the work afterwards. I do think it makes
connection pooling etc easier.


> I just want to receive some feedback and know if community is interested in
> any further work in this direction.

I personally am. I think it's beyond high time that we move to take
advantage of threads.

That said, I don't think just replacing processes with threads is the right thing. I'm
pretty sure we'd still want to have postmaster as a separate process,
for robustness. Possibly we even want to continue having various
processes around besides that, the most interesting cases involving
threads are around intra-query parallelism, and pooling, and for both a
hybrid model could be beneficial.

Re: Postgres with pthread

2017-12-06 Thread Adam Brusselback
Here it is formatted a little better.



So a little over 50% performance improvement for a couple of the test cases.



On Wed, Dec 6, 2017 at 11:53 AM, Tom Lane  wrote:

> Konstantin Knizhnik  writes:
> > Below are some results (1000xTPS) of select-only (-S) pgbench with scale
> > 100 at my desktop with quad-core i7-4770 3.40GHz and 16Gb of RAM:
>
> > Connections   Vanilla/default   Vanilla/prepared   pthreads/default   pthreads/prepared
> > 10            100               191                106                207
> > 100           67                131                105                168
> > 1000          41                65                 55                 102
>
> This table is so mangled that I'm not very sure what it's saying.
> Maybe you should have made it an attachment?
>
> However, if I guess at which numbers are supposed to be what,
> it looks like even the best case is barely a 50% speedup.
> That would be worth pursuing if it were reasonably low-hanging
> fruit, but converting PG to threads seems very far from being that.
>
> I think you've done us a very substantial service by pursuing
> this far enough to get some quantifiable performance results.
> But now that we have some results in hand, I think we're best
> off sticking with the architecture we've got.
>
> regards, tom lane
>
>