Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On 12/13/2015 01:23 PM, Gerhard Wiesinger wrote: On 13.12.2015 21:14, Bill Moran wrote: Wait ... this is a combined HTTP/Postgres server? You didn't mention that earlier, and it's kind of important. What evidence do you have that Postgres is actually the part of this system running out of memory? For me the complete picture doesn't look consistent. I don't see any such evidence in any of your emails, and (based on experience) I find it pretty likely that whatever is running under node is doing something in a horrifically memory-inefficient manner. Since you mention that you see nothing in the PG logs, that makes it even more likely (to me) that you're looking entirely in the wrong place. I'd be willing to bet a steak dinner that if you put the web server on a different server than the DB, that the memory problems would follow the web server and not the DB server. Changes in config: track_activity_query_size = 102400 work_mem = 100MB Ok, we restarted PostgreSQL and had it stopped for seconds, and logged top every second: When PostgreSQL was down nearly all memory was freed, looks good to me. So it is likely that node and other processes are not the cause. Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free Swap: 512M Total, 477M Used, 35M Free, 93% Inuse When PostgreSQL restarted, Inactive was growing fast (~1min): Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M Free Swap: 512M Total, 472M Used, 41M Free, 92% Inuse After some few minutes we are back again at the same situation: Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free Swap: 512M Total, 472M Used, 41M Free, 92% Inuse The steak dinner is mine :-) Donating to the PostgreSQL community :-) To me all the above proves is that this a complete system issue and only with all the parts running do you get a problem. It still does indicate which part or interaction of parts is the issue. This is further muddied by no description of what, if anything, you where doing in the above scenarios. Any further ideas, I don't think this is normal system behaviour. Ciao, Gerhard -- Adrian Klaver adrian.kla...@aklaver.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
Hello, some further details from the original FreeBSD 10.1 machine: Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse PID USERNAMETHR PRI NICE SIZERES STATE C TIME WCPU COMMAND 77941 pgsql 5 200 7925M 7296M usem2 352:34 6.98% postgres: username dbnamee 127.0.0.1(43367) (postgres) 75862 pgsql 1 200 7321M 7266M select 4 7:04 0.00% postgres: checkpointer process(postgres) 75863 pgsql 1 200 7321M 7260M select 4 3:34 0.00% postgres: writer process(postgres) 75867 pgsql 1 200 7323M 7237M sbwait 0 3:19 0.00% postgres: username dbnamee ipaddress(58977) (postgres) 76178 pgsql 1 200 7323M 7237M sbwait 2 3:18 0.00% postgres: username dbnamee ipaddress(35807) (postgres) 75868 pgsql 1 200 7323M 7237M sbwait 0 3:20 0.00% postgres: username dbnamee ipaddress(58978) (postgres) 75884 pgsql 1 200 7323M 7236M sbwait 3 3:20 0.00% postgres: username dbnamee ipaddress(59908) (postgres) 75869 pgsql 1 200 7323M 7236M sbwait 3 3:20 0.00% postgres: username dbnamee ipaddress(58982) (postgres) 76181 pgsql 1 200 7323M 7236M sbwait 2 3:18 0.00% postgres: username dbnamee ipaddress(35813) (postgres) 75883 pgsql 1 200 7323M 7236M sbwait 0 3:20 0.00% postgres: username dbnamee ipaddress(59907) (postgres) 76180 pgsql 1 200 7323M 7236M sbwait 1 3:19 0.00% postgres: username dbnamee ipaddress(35811) (postgres) 76177 pgsql 1 200 7323M 7236M sbwait 1 3:18 0.00% postgres: username dbnamee ipaddress(35712) (postgres) 76179 pgsql 1 200 7323M 7236M sbwait 5 3:18 0.00% postgres: username dbnamee ipaddress(35810) (postgres) 64951 pgsql 1 750 7375M 662M CPU11 0:01 11.47% postgres: username dbnamee 127.0.0.1(32073) (postgres) 64950 pgsql 1 770 7325M 598M CPU66 0:02 16.55% postgres: username dbnamee 127.0.0.1(44251) (postgres) Out of memory: kernel: swap_pager_getswapspace(4): failed kernel: swap_pager_getswapspace(8): failed kernel: swap_pager_getswapspace(3): failed Main issue is IHMO (as far as I understood the FreeBSD Memory system) that 20G are INACTIVE. When I subtract the shared memory, even ~13GB should be available, but they are still allocated but inactive (INACTIVE). INACTIVE memory might be clean or dirty. As we get into out of memory situations it is likely that the memory is dirty (otherwise it would have been reused). Config: Memory: 32GB, Swap: 512MB maintenance_work_mem = 512MB effective_cache_size = 10GB work_mem = 892MB wal_buffers = 8MB checkpoint_segments = 16 shared_buffers = 7080MB max_connections = 80 autovacuum_max_workers = 3 Thnx. Ciao, Gerhard On 13.12.2015 08:49, Gerhard Wiesinger wrote: Hello, We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple worker processes connected via persistent connections to PostgreSQL, they perform just simple queries with SELECT on primary keys and simple INSERTS/UPDATES. Normally nearly all the workers are idle but they still consume the maximum configured work mem on the PostgreSQL server and the memory is also resident. If some other queries get in we get into out of memory situations. So it looks like PostgreSQL has memory leaks. I found a test scenario to reproduce it also on a newer FreeBSD 10.2 VM as well as in a Fedora 23 VM (both with PostgreSQL 9.4.5): Executions in psql with one persisent connection: -- Create the table CREATE TABLE t_random AS SELECT s, md5(random()::text) FROM generate_Series(1,1) s; -- Create the index CREATE INDEX ON t_random(s); -- Restart psql with a new connection: -- Memory goes slighty up after each execution even after canceling: -- Memory leak on FreeBSD 10.2/Fedora 23 and PostgreSQL 9.4.5 on cancel the query or multiple execution SELECT * FROM t_random ORDER BY md5 LIMIT 10; -- Therefore I created a function: CREATE OR REPLACE FUNCTION execmultiplei(IN num int8) RETURNS void AS $$ BEGIN -- RAISE NOTICE 'num=%', num; FOR i IN 1..num LOOP PERFORM * FROM t_random WHERE s = i; END LOOP; END; $$ LANGUAGE plpgsql; -- Test it several times SELECT execmultiplei(1000); -- Linux testing (FreeBSD is similar), relevant part is RES (resident memory): PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND -- after startup of psql 26851 postgres 20 0 2363276 7432 6292 S 0.0 0.2 0:00.00 postgres: postgres postgres [local] idle -- Memory goes up, ok so far 26851 postgres 20 0 2365732 255152 253548 R 99.0 6.3 0:10.77 postgres: postgres postgres [local] SELECT 26851 postgres 20 0 2365732 408464 406788 R 100.0 10.1 0:17.81 postgres: postgres postgres [local] SELECT 26851 postgres 20 0 2365732 864472 862576 R 100.0 21.4 0:38.90 postgres: postgres
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
Hello Bill, Thank you for your response, comments inline: On 13.12.2015 16:05, Bill Moran wrote: On Sun, 13 Dec 2015 09:57:21 +0100 Gerhard Wiesingerwrote: some further details from the original FreeBSD 10.1 machine: Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse PID USERNAMETHR PRI NICE SIZERES STATE C TIME WCPU COMMAND 77941 pgsql 5 200 7925M 7296M usem2 352:34 6.98% postgres: username dbnamee 127.0.0.1(43367) (postgres) I see no evidence of an actual leak here. Each process is basically using the 7G of shared_buffers you have allocated in the config (which is only 7G _total_ for all processes, since it's shared memory) OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? Out of memory: kernel: swap_pager_getswapspace(4): failed kernel: swap_pager_getswapspace(8): failed kernel: swap_pager_getswapspace(3): failed Main issue is IHMO (as far as I understood the FreeBSD Memory system) that 20G are INACTIVE. When I subtract the shared memory, even ~13GB should be available, but they are still allocated but inactive (INACTIVE). INACTIVE memory might be clean or dirty. As we get into out of memory situations it is likely that the memory is dirty (otherwise it would have been reused). Not quite correct. Inactive memory is _always_ available for re-use. Are you sure that's true? Monitoring inactive memory: cat vm_stat.sh #!/usr/bin/env bash while [ 1 ]; do date +%Y.%m.%d.%H.%M.%S sysctl -a | grep vm.stats.vm. sleep 1 done And even we get out of memory with swap_pager_getswapspace Inactive Memory (from the log file) is around 20GB (doesn't go down or up) vm.stats.vm.v_inactive_count: 5193530 (*4k pages is around 20GB) Then we have 20GB inactive memory, but we still get out of memory with kernel: swap_pager_getswapspace(4): failed. Any ideas why? Config: Memory: 32GB, Swap: 512MB Probably not the cause of this problem, but this is a non-optimal layout. If you're going to use swap at all, it should generally be 2x the size of your RAM for optimal performance under memory pressure. Yes, we will try to disable it totally. Nevertheless why do we get out of memory/Swap? maintenance_work_mem = 512MB effective_cache_size = 10GB work_mem = 892MB I expect that this value is the cause of the problem. The scenario you describe below is sorting a large table on an unindexed column, meaning it will have to use all that work_mem. I'd be interested to see the output of: EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 10; That was only a test query, has nothing to do with production based query. They are mostly SELECT/INSERTS/UPDATES on primary keys. But even without that information, I'd recommend you reduce work_mem to about 16M or so. Why so low? E.g. sorting on reporting or some long running queries are then done on disk and not in memory. wal_buffers = 8MB checkpoint_segments = 16 shared_buffers = 7080MB max_connections = 80 autovacuum_max_workers = 3 [snip] We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple worker processes connected via persistent connections to PostgreSQL, they perform just simple queries with SELECT on primary keys and simple INSERTS/UPDATES. That's not at all the test scenario you show below. The scenario below is a large sort operation on a non-indexed column, which is vastly different than a single-row fetch based on an index. Yes, that non indexed select was just for testing purporeses. Normally nearly all the workers are idle but they still consume the maximum configured work mem on the PostgreSQL server and the memory is also resident. I see no evidence of that in your top output. Each process has a reference to the 7G of shared_buffers you allocated, which is memory shared by all processes, and is expected. I'm not as familiar with Linux top, but the results appear to be the same. Yes, might be the case, but if it is n times 7G shared memory then we have ~20GB Inactive Memory available, so plenty of memory. And why are we getting: kernel: swap_pager_getswapspace(4): failed? Thnx. Ciao, Gerhard -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
Gerhard Wiesingerwrites: >> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free >> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse > OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? Just judging from the name of the function, I would bet this is a direct result of having only 512M of swap configured. As Bill already pointed out, that's a pretty useless choice on a system with 32G of RAM. As soon as the kernel tries to push out any significant amount of idle processes, it's gonna be out of swap space. The numbers you show above prove that it is almost out of free swap already. Also, while that 20G of "inactive" pages may be candidates for reuse, they probably can't actually be reused without swapping them out ... and there's noplace for that data to go. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 16:35:08 +0100 Gerhard Wiesingerwrote: > Hello Bill, > > Thank you for your response, comments inline: > > On 13.12.2015 16:05, Bill Moran wrote: > > On Sun, 13 Dec 2015 09:57:21 +0100 > > Gerhard Wiesinger wrote: > >> some further details from the original FreeBSD 10.1 machine: > >> > >> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free > >> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse > >> > >> PID USERNAMETHR PRI NICE SIZERES STATE C TIME WCPU > >> COMMAND > >> 77941 pgsql 5 200 7925M 7296M usem2 352:34 6.98% > >> postgres: username dbnamee 127.0.0.1(43367) (postgres) > > > > > > I see no evidence of an actual leak here. Each process is basically using > > the 7G of shared_buffers you have allocated in the config (which is only > > 7G _total_ for all processes, since it's shared memory) > > OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? You haven't provided enough information to isolate that cause yet. What's in the Postgres log? Surely it will have logged something when its request for RAM was denied, and it should be more informational than the OS' generic message. > >> Out of memory: > >> kernel: swap_pager_getswapspace(4): failed > >> kernel: swap_pager_getswapspace(8): failed > >> kernel: swap_pager_getswapspace(3): failed > >> > >> Main issue is IHMO (as far as I understood the FreeBSD Memory system) > >> that 20G are INACTIVE. When I subtract the shared memory, even ~13GB > >> should be available, but they are still allocated but inactive > >> (INACTIVE). INACTIVE memory might be clean or dirty. As we get into out > >> of memory situations it is likely that the memory is dirty (otherwise it > >> would have been reused). > > Not quite correct. Inactive memory is _always_ available for re-use. > > Are you sure that's true? Yes. Read The Design and Implementation of FreeBSD for the details. > Monitoring inactive memory: > cat vm_stat.sh > #!/usr/bin/env bash > > while [ 1 ]; do >date +%Y.%m.%d.%H.%M.%S >sysctl -a | grep vm.stats.vm. >sleep 1 > done > > And even we get out of memory with swap_pager_getswapspace Inactive > Memory (from the log file) is around 20GB (doesn't go down or up) > vm.stats.vm.v_inactive_count: 5193530 (*4k pages is around 20GB) > > Then we have 20GB inactive memory, but we still get out of memory with > kernel: swap_pager_getswapspace(4): failed. Any ideas why? Theory: If the planner decides it needs to do 30 sort operations for a query, it will try to allocate 27G of RAM, which exceeds what's available, and therefore never gets allocated. So you get the "out of space" message, but the actual memory usage doesn't change. > >> maintenance_work_mem = 512MB > >> effective_cache_size = 10GB > >> work_mem = 892MB > > I expect that this value is the cause of the problem. The scenario you > > describe below is sorting a large table on an unindexed column, meaning > > it will have to use all that work_mem. I'd be interested to see the > > output of: > > > > EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 10; > > That was only a test query, has nothing to do with production based > query. They are mostly SELECT/INSERTS/UPDATES on primary keys. Then provide the _actual_ queries and the EXPLAIN ANALYZE and the table layouts and basic data distribution of the actual cause. If your test case is completely non-representative of what's happening, then you're not going to get useful answers. > > But even without that information, I'd recommend you reduce work_mem > > to about 16M or so. > > Why so low? E.g. sorting on reporting or some long running queries are > then done on disk and not in memory. Even a simple query could involve multiple sorts, and you're allowing each sort to use up to 890M of RAM (which is _not_ shared). As noted earlier, even a moderately complex query could exceed the available RAM on the system. But since you don't provide the actual queries and tables causing problems, I can only guess. And since you appear to have already decided what the cause of the problem is, then crafted completely non-relevent queries that you think prove your point, I'm not sure there's anything I can do to help you. > >> wal_buffers = 8MB > >> checkpoint_segments = 16 > >> shared_buffers = 7080MB > >> max_connections = 80 > >> autovacuum_max_workers = 3 > > [snip] > > > >>> We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple > >>> worker processes connected via persistent connections to PostgreSQL, > >>> they perform just simple queries with SELECT on primary keys and > >>> simple INSERTS/UPDATES. > > That's not at all the test scenario you show below. The scenario below > > is a large sort operation on a non-indexed column, which is vastly > > different than a single-row fetch based on an index. > > Yes, that non indexed select was just for testing purporeses.
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 09:57:21 +0100 Gerhard Wiesingerwrote: > > some further details from the original FreeBSD 10.1 machine: > > Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free > Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse > >PID USERNAMETHR PRI NICE SIZERES STATE C TIME WCPU COMMAND > 77941 pgsql 5 200 7925M 7296M usem2 352:34 6.98% > postgres: username dbnamee 127.0.0.1(43367) (postgres) I see no evidence of an actual leak here. Each process is basically using the 7G of shared_buffers you have allocated in the config (which is only 7G _total_ for all processes, since it's shared memory) > Out of memory: > kernel: swap_pager_getswapspace(4): failed > kernel: swap_pager_getswapspace(8): failed > kernel: swap_pager_getswapspace(3): failed > > Main issue is IHMO (as far as I understood the FreeBSD Memory system) > that 20G are INACTIVE. When I subtract the shared memory, even ~13GB > should be available, but they are still allocated but inactive > (INACTIVE). INACTIVE memory might be clean or dirty. As we get into out > of memory situations it is likely that the memory is dirty (otherwise it > would have been reused). Not quite correct. Inactive memory is _always_ available for re-use. > Config: > Memory: 32GB, Swap: 512MB Probably not the cause of this problem, but this is a non-optimal layout. If you're going to use swap at all, it should generally be 2x the size of your RAM for optimal performance under memory pressure. > maintenance_work_mem = 512MB > effective_cache_size = 10GB > work_mem = 892MB I expect that this value is the cause of the problem. The scenario you describe below is sorting a large table on an unindexed column, meaning it will have to use all that work_mem. I'd be interested to see the output of: EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 10; But even without that information, I'd recommend you reduce work_mem to about 16M or so. > wal_buffers = 8MB > checkpoint_segments = 16 > shared_buffers = 7080MB > max_connections = 80 > autovacuum_max_workers = 3 [snip] > > We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple > > worker processes connected via persistent connections to PostgreSQL, > > they perform just simple queries with SELECT on primary keys and > > simple INSERTS/UPDATES. That's not at all the test scenario you show below. The scenario below is a large sort operation on a non-indexed column, which is vastly different than a single-row fetch based on an index. > > Normally nearly all the workers are idle but > > they still consume the maximum configured work mem on the PostgreSQL > > server and the memory is also resident. I see no evidence of that in your top output. Each process has a reference to the 7G of shared_buffers you allocated, which is memory shared by all processes, and is expected. I'm not as familiar with Linux top, but the results appear to be the same. > > If some other queries get in > > we get into out of memory situations. So it looks like PostgreSQL has > > memory leaks. > > > > I found a test scenario to reproduce it also on a newer FreeBSD 10.2 > > VM as well as in a Fedora 23 VM (both with PostgreSQL 9.4.5): > > > > Executions in psql with one persisent connection: > > -- Create the table > > CREATE TABLE t_random AS SELECT s, md5(random()::text) FROM > > generate_Series(1,1) s; > > -- Create the index > > CREATE INDEX ON t_random(s); > > > > -- Restart psql with a new connection: > > > > -- Memory goes slighty up after each execution even after canceling: > > -- Memory leak on FreeBSD 10.2/Fedora 23 and PostgreSQL 9.4.5 on > > cancel the query or multiple execution > > SELECT * FROM t_random ORDER BY md5 LIMIT 10; > > > > -- Therefore I created a function: > > CREATE OR REPLACE FUNCTION execmultiplei(IN num int8) > > RETURNS void AS $$ > > BEGIN > > -- RAISE NOTICE 'num=%', num; > > FOR i IN 1..num LOOP > > PERFORM * FROM t_random WHERE s = i; > > END LOOP; > > END; > > $$ LANGUAGE plpgsql; > > > > -- Test it several times > > SELECT execmultiplei(1000); > > > > -- Linux testing (FreeBSD is similar), relevant part is RES (resident > > memory): > > PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND > > -- after startup of psql > > 26851 postgres 20 0 2363276 7432 6292 S 0.0 0.2 0:00.00 > > postgres: postgres postgres [local] idle > > -- Memory goes up, ok so far > > 26851 postgres 20 0 2365732 255152 253548 R 99.0 6.3 0:10.77 > > postgres: postgres postgres [local] SELECT > > 26851 postgres 20 0 2365732 408464 406788 R 100.0 10.1 0:17.81 > > postgres: postgres postgres [local] SELECT > > 26851 postgres 20 0 2365732 864472 862576 R 100.0 21.4 0:38.90 > > postgres: postgres postgres [local] SELECT > > -- Function execmultiplei and transaction terminated, but memory still > > allocated!!! > > 26851 postgres 20 0
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On 13.12.2015 18:17, Tom Lane wrote: Gerhard Wiesingerwrites: Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? Just judging from the name of the function, I would bet this is a direct result of having only 512M of swap configured. As Bill already pointed out, that's a pretty useless choice on a system with 32G of RAM. As soon as the kernel tries to push out any significant amount of idle processes, it's gonna be out of swap space. The numbers you show above prove that it is almost out of free swap already. The system wasn't designed by me, I wouldn't do it either that way. Does swapoff help? Also, while that 20G of "inactive" pages may be candidates for reuse, they probably can't actually be reused without swapping them out ... and there's noplace for that data to go. There is no log in syslog (where postgres log) when swap_pager_getswapspace is logged. But why do we have 20G of Inactive pages? They are still allocated by kernel or user space. As you can see below (top output) NON Postgres processes are around 9G in virtual size, resident even lower. The system is nearly idle, and the queries typically aren't active after one second agin. Therefore where does the rest of the 11G of Inactive pages come from (if it isn't a Postgres/FreeBSD memory leak)? I read that Postgres has it's own memory allocator: https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/ Might that be an issue with double allocation/freeing and the "cheese hole" topic with memory fragmentation? https://www.opennet.ru/base/dev/fbsdvm.txt.html inactivepages not actively used by programs which are dirty and (at some point) need to be written to their backing store (typically disk). These pages are still associated with objects and can be reclaimed if a program references them. Pages can be moved from the active to the inactive queue at any time with little adverse effect. Moving pages to the cache queue has bigger consequences (note 1) https://unix.stackexchange.com/questions/134862/what-do-the-different-memory-counters-in-freebsd-mean Active: Memory currently being used by a process Inactive: Memory that has been freed but is still cached since it may be used again. If more Free memory is required, this memory can be cleared and become free. This memory is not cleared before it is needed, because "free memory is wasted memory", it doesn't cost anything to keep the old data around in case it is needed again. Wired: Memory in use by the Kernel. This memory cannot be swapped out (GW: including ZFS cache!!!) Cache: Memory being used to cache data, can be freed immediately if required Buffers: Disk cache Free: Memory that is completely free and ready to use. Inactive, Cache and Buffers can become free if they are cleaned up. Thnx. Ciao, Gerhard last pid: 7277; load averages: 0.91, 0.96, 1.02 up 18+06:22:31 18:57:54 135 processes: 2 running, 132 sleeping, 1 waiting Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free Swap: 512M Total, 501M Used, 12M Free, 97% Inuse PID USERNAMETHR PRI NICE SIZERES STATE C TIMEWCPU COMMAND 77941 pgsql 5 200 7921M 7295M usem7 404:32 10.25% postgres 79570 pgsql 1 200 7367M 6968M sbwait 6 4:24 0.59% postgres 73449 pgsql 1 270 7367M 6908M sbwait 4 8:08 5.08% postgres 74209 pgsql 1 270 7367M 6803M sbwait 0 6:37 1.46% postgres 74207 pgsql 1 250 7367M 6748M sbwait 6 6:34 1.76% postgres 74206 pgsql 1 220 7367M 6548M sbwait 5 6:44 1.86% postgres 73380 pgsql 1 230 7367M 7265M sbwait 4 8:15 1.17% postgres 74208 pgsql 1 240 7367M 7223M sbwait 1 6:30 4.59% postgres 79569 pgsql 1 240 7367M 7105M sbwait 3 4:36 1.17% postgres 74210 pgsql 1 290 7363M 7182M sbwait 5 6:41 5.47% postgres 73479 pgsql 1 220 7363M 6560M sbwait 6 7:14 3.56% postgres 83030 pgsql 1 200 7329M 193M sbwait 5 0:00 0.00% postgres 76178 pgsql 1 200 7323M 7245M sbwait 2 3:44 0.00% postgres 75867 pgsql 1 200 7323M 7245M sbwait 2 3:45 0.00% postgres 75869 pgsql 1 200 7323M 7245M sbwait 2 3:46 0.00% postgres 75883 pgsql 1 200 7323M 7245M sbwait 7 3:46 0.00% postgres 76180 pgsql
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On 13.12.2015 21:14, Bill Moran wrote: Wait ... this is a combined HTTP/Postgres server? You didn't mention that earlier, and it's kind of important. What evidence do you have that Postgres is actually the part of this system running out of memory? For me the complete picture doesn't look consistent. I don't see any such evidence in any of your emails, and (based on experience) I find it pretty likely that whatever is running under node is doing something in a horrifically memory-inefficient manner. Since you mention that you see nothing in the PG logs, that makes it even more likely (to me) that you're looking entirely in the wrong place. I'd be willing to bet a steak dinner that if you put the web server on a different server than the DB, that the memory problems would follow the web server and not the DB server. Changes in config: track_activity_query_size = 102400 work_mem = 100MB Ok, we restarted PostgreSQL and had it stopped for seconds, and logged top every second: When PostgreSQL was down nearly all memory was freed, looks good to me. So it is likely that node and other processes are not the cause. Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free Swap: 512M Total, 477M Used, 35M Free, 93% Inuse When PostgreSQL restarted, Inactive was growing fast (~1min): Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M Free Swap: 512M Total, 472M Used, 41M Free, 92% Inuse After some few minutes we are back again at the same situation: Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free Swap: 512M Total, 472M Used, 41M Free, 92% Inuse The steak dinner is mine :-) Donating to the PostgreSQL community :-) Any further ideas, I don't think this is normal system behaviour. Ciao, Gerhard -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 20:09:04 +0100 Gerhard Wiesingerwrote: > On 13.12.2015 18:17, Tom Lane wrote: > > Gerhard Wiesinger writes: > >>> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free > >>> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse > >> OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? > > Just judging from the name of the function, I would bet this is a direct > > result of having only 512M of swap configured. As Bill already pointed > > out, that's a pretty useless choice on a system with 32G of RAM. As soon > > as the kernel tries to push out any significant amount of idle processes, > > it's gonna be out of swap space. The numbers you show above prove that > > it is almost out of free swap already. > > The system wasn't designed by me, I wouldn't do it either that way. Does > swapoff help? FreeBSD and Linux (and most modern OS) are designed to have swap, and usually more swap than RAM. I have never heard a good reason for not using swap, and the reasons I _have_ heard have always been by people misinformed about how the OS works. If someone has a _good_ explanation for why you wouldn't want any swap on a DB server, I'd love to hear it; but everything I've heard up till now has been speculation based on misinformation. IOW: no, you should not turn swap off, you should instead allocate the appropriate amount of swap space. > > Also, while that 20G of "inactive" pages may be candidates for reuse, > > they probably can't actually be reused without swapping them out ... > > and there's noplace for that data to go. > > There is no log in syslog (where postgres log) when > swap_pager_getswapspace is logged. > > But why do we have 20G of Inactive pages? They are still allocated by > kernel or user space. As you can see below (top output) NON Postgres > processes are around 9G in virtual size, resident even lower. The system > is nearly idle, and the queries typically aren't active after one second > agin. Therefore where does the rest of the 11G of Inactive pages come > from (if it isn't a Postgres/FreeBSD memory leak)? > I read that Postgres has it's own memory allocator: > https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/ > Might that be an issue with double allocation/freeing and the "cheese > hole" topic with memory fragmentation? If there were a memory leak in either FreeBSD or Postgres of the seriousness you're describing that were as easy to trigger as you claim, I would expect the mailing lists and other support forums to be exploding in panic. Notice that they are not. Also, I still don't see _ANY_ evidence of a leak. I see evidence that something is happening that is trying to allocate a LOT of RAM, that isn't available on your system; but that's not the same as a leak. > https://www.opennet.ru/base/dev/fbsdvm.txt.html > inactivepages not actively used by programs which are > dirty and (at some point) need to be written > to their backing store (typically disk). > These pages are still associated with objects and > can be reclaimed if a program references them. > Pages can be moved from the active to the inactive > queue at any time with little adverse effect. > Moving pages to the cache queue has bigger > consequences (note 1) Correct, but, when under pressure, the system _will_ recycle those pages to be available. Tom might be correct in that the system thinks they are inactive because it could easily push them out to swap, but then it can't _actually_ do that because you haven't allocated enough swap, but that doesn't match my understanding of how inactive is used. A question of that detail would be better asked on a FreeBSD forum, as the differences between different VM implementations can be pretty specific and technical. [snip] > Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free > Swap: 512M Total, 501M Used, 12M Free, 97% Inuse > >PID USERNAMETHR PRI NICE SIZERES STATE C TIMEWCPU COMMAND > 77941 pgsql 5 200 7921M 7295M usem7 404:32 10.25% > postgres > 79570 pgsql 1 200 7367M 6968M sbwait 6 4:24 0.59% postgres [snip about 30 identical PG processes] > 32387 myusername9 200 980M 375M uwait 5 69:03 1.27% node [snip similar processes] >622 myusername1 200 261M 3388K kqread 3 41:01 0.00% nginx [snip similar processes] Wait ... this is a combined HTTP/Postgres server? You didn't mention that earlier, and it's kind of important. What evidence do you have that Postgres is actually the part of this system running out of memory? I don't see any such evidence in any of your emails, and
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 22:23:19 +0100 Gerhard Wiesingerwrote: > On 13.12.2015 21:14, Bill Moran wrote: > > Wait ... this is a combined HTTP/Postgres server? You didn't mention that > > earlier, and it's kind of important. > > > > What evidence do you have that Postgres is actually the part of > > this system running out of memory? > > For me the complete picture doesn't look consistent. That's because you haven't gathered enough of the right type of information. > > I don't see any such evidence in any of > > your emails, and (based on experience) I find it pretty likely that whatever > > is running under node is doing something in a horrifically > > memory-inefficient > > manner. Since you mention that you see nothing in the PG logs, that makes it > > even more likely (to me) that you're looking entirely in the wrong place. > > > > I'd be willing to bet a steak dinner that if you put the web server on a > > different server than the DB, that the memory problems would follow the > > web server and not the DB server. > > Changes in config: > track_activity_query_size = 102400 > work_mem = 100MB > > Ok, we restarted PostgreSQL and had it stopped for seconds, and logged > top every second: > > When PostgreSQL was down nearly all memory was freed, looks good to me. > So it is likely that node and other processes are not the cause. > Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free > Swap: 512M Total, 477M Used, 35M Free, 93% Inuse > > When PostgreSQL restarted, Inactive was growing fast (~1min): > Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M > Free > Swap: 512M Total, 472M Used, 41M Free, 92% Inuse > > After some few minutes we are back again at the same situation: > Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free > Swap: 512M Total, 472M Used, 41M Free, 92% Inuse > > The steak dinner is mine :-) Donating to the PostgreSQL community :-) Based on the fact that the inactive memory increased? Your understanding of inactive memory in FreeBSD is incorrect. Those pages are probably DB pages that the OS is keeping in inactive memory because Postgres requests them over and over, which is what the OS is supposed to do to ensure the best performance. Are you seeing any out of swap space errors? Even if you are, you still haven't determined if the problem is the result of Postgres or the node.js stuff you have running. I don't know what node.js might be caching on the client side ... do you? No. Until you can actually report back something other than wild speculation, I'll keep that steak dinner for myself. Besides, that bet was based on you putting the PG server on seperate hardware from the web server, which you didn't do. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Memory Leak executing small queries without closing the connection
Hello, We are running PostgreSQL 9.4.5 on FreeBSD 10.1 and have multiple worker processes connected via persistent connections to PostgreSQL, they perform just simple queries with SELECT on primary keys and simple INSERTS/UPDATES. Normally nearly all the workers are idle but they still consume the maximum configured work mem on the PostgreSQL server and the memory is also resident. If some other queries get in we get into out of memory situations. So it looks like PostgreSQL has memory leaks. I found a test scenario to reproduce it also on a newer FreeBSD 10.2 VM as well as in a Fedora 23 VM (both with PostgreSQL 9.4.5): Executions in psql with one persisent connection: -- Create the table CREATE TABLE t_random AS SELECT s, md5(random()::text) FROM generate_Series(1,1) s; -- Create the index CREATE INDEX ON t_random(s); -- Restart psql with a new connection: -- Memory goes slighty up after each execution even after canceling: -- Memory leak on FreeBSD 10.2/Fedora 23 and PostgreSQL 9.4.5 on cancel the query or multiple execution SELECT * FROM t_random ORDER BY md5 LIMIT 10; -- Therefore I created a function: CREATE OR REPLACE FUNCTION execmultiplei(IN num int8) RETURNS void AS $$ BEGIN -- RAISE NOTICE 'num=%', num; FOR i IN 1..num LOOP PERFORM * FROM t_random WHERE s = i; END LOOP; END; $$ LANGUAGE plpgsql; -- Test it several times SELECT execmultiplei(1000); -- Linux testing (FreeBSD is similar), relevant part is RES (resident memory): PID USER PR NIVIRTRESSHR S %CPU %MEM TIME+ COMMAND -- after startup of psql 26851 postgres 20 0 2363276 7432 6292 S 0.0 0.2 0:00.00 postgres: postgres postgres [local] idle -- Memory goes up, ok so far 26851 postgres 20 0 2365732 255152 253548 R 99.0 6.3 0:10.77 postgres: postgres postgres [local] SELECT 26851 postgres 20 0 2365732 408464 406788 R 100.0 10.1 0:17.81 postgres: postgres postgres [local] SELECT 26851 postgres 20 0 2365732 864472 862576 R 100.0 21.4 0:38.90 postgres: postgres postgres [local] SELECT -- Function execmultiplei and transaction terminated, but memory still allocated!!! 26851 postgres 20 0 2365732 920668 918748 S 0.0 22.7 0:41.40 postgres: postgres postgres [local] idle -- Calling it again 26851 postgres 20 0 2365732 920668 918748 R 99.0 22.7 0:46.51 postgres: postgres postgres [local] SELECT -- idle again, memory still allocated 26851 postgres 20 0 2365732 920668 918748 S 0.0 22.7 1:22.54 postgres: postgres postgres [local] idle Memory will only be released if psql is exited. According to the PostgreSQL design memory should be freed when the transaction completed. top commands on FreeBSD: top -SaPz -o res -s 1 top commands on Linux: top -o RES d1 Config: VMs with 4GB of RAM, 2 vCPUs shared_buffers = 2048MB # min 128kB effective_cache_size = 2GB work_mem = 892MB wal_buffers = 8MB checkpoint_segments = 16 Any ideas? Thank you. Ciao, Gerhard -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general