Re: [SPAM] [PERFORM] OS cache management
Yeah, thanks. We have it in count. On Wed, Sep 6, 2017 at 9:45 PM, Moreno Andreo <moreno.and...@evolu-s.it> wrote: > Il 06/09/2017 10:12, Soni M ha scritto: > >> >> >> Let's say I have 10 year data, and commonly used data only the last 1 >> year. This data is quite big, so each table and index file is divided into >> several file in PGDATA/base >> >> May not be relevant to what you asked, but if you want to keep last yeat > data in a "small and fast" dataset separated (physically separated!) by old > data (that's still available, but response times may vary), IMHO you should > consider partitioning... > https://www.postgresql.org/docs/current/static/ddl-partitioning.html > > HTH, > Moreno.- > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > -- Regards, Soni Maula Harriz
Re: [PERFORM] OS cache management
In our environment, OS cache is much bigger than postgres buffers. Postgres buffers around 8 GB, OS cache more than 100 GB. Maybe we should inspect pgfincore On Wed, Sep 6, 2017 at 9:13 PM, Gerardo Herzig <gher...@fmed.uba.ar> wrote: > > > - Mensaje original - > > De: "Soni M" <diptat...@gmail.com> > > Para: pgsql-performance@postgresql.org > > Enviados: MiƩrcoles, 6 de Septiembre 2017 5:12:26 > > Asunto: [PERFORM] OS cache management > > > > Hello All, I would like to know about how OS cache works for postgres > table > > and index file. > > > > Let's say I have 10 year data, and commonly used data only the last 1 > year. > > This data is quite big, so each table and index file is divided into > > several file in PGDATA/base > > > > Let's say 1 index named order_by_date has relfilenode = 1870772348, and > > it's file consist of 1870772348, 1870772348.1, and 1870772348.2 > > > > And for oftenly queried 1 year data, do ALL files for the order_by_date > > pushed to OS cache ? or it's just 1 file that contains index to this 1 > year > > data. > > > > Postgres has its own cache (defined by the "shared_buffers" variable). > Usually, the unit of movement in and out from the cache is a 8k page > (defined at compilation time), so you cant put it directly in terms of > files. > > There is an extension that can inspect the cache contents: > https://www.postgresql.org/docs/current/static/pgbuffercache.html > > HTH > Gerardo > -- Regards, Soni Maula Harriz
[PERFORM] OS cache management
Hello All, I would like to know about how OS cache works for postgres table and index file. Let's say I have 10 year data, and commonly used data only the last 1 year. This data is quite big, so each table and index file is divided into several file in PGDATA/base Let's say 1 index named order_by_date has relfilenode = 1870772348, and it's file consist of 1870772348, 1870772348.1, and 1870772348.2 And for oftenly queried 1 year data, do ALL files for the order_by_date pushed to OS cache ? or it's just 1 file that contains index to this 1 year data. How about index named order_by_customer, will ALL the index files pushed to OS cache ? Can someone please explain about how OS cache works for this condition. Thanks very much for the explanation. -- Regards, Soni Maula Harriz
Re: [PERFORM] slow index scan performance
Trying on another server, it gives different result. -> Index Scan using response_log_by_activity on public.response_log rl2 (cost=0.00..50.29 rows=17 width=8) (actual time=0.955..0.967 rows=0 loops=30895) Output: rl2.activity_id, rl2.feed_id Index Cond: (rl2.activity_id = rl.activity_id) Filter: rl2.success Buffers: shared hit=2311312 read=132342 -> Index Scan using activity_pkey on public.activity a (cost=0.00..49.79 rows=1 width=12) (actual time=13.747..13.762 rows=1 loops=30892) Output: a.status_id, a.activity_id, a.visit_id Index Cond: (a.activity_id = rl.activity_id) Buffers: shared hit=124463 read=30175 Now, index scan on activity_pkey which take much slower. Can someone please explain these ? Thanks On Tue, Sep 5, 2017 at 8:46 PM, Soni M <diptat...@gmail.com> wrote: > It's Postgres 9.1.24 on RHEL 6.5 > > On Tue, Sep 5, 2017 at 8:24 PM, Soni M <diptat...@gmail.com> wrote: > >> Consider these 2 index scan produced by a query >> >> -> Index Scan using response_log_by_activity on public.response_log rl2 >> (cost=0.00..51.53 rows=21 width=8) (actual time=9.017..9.056 rows=0 >> loops=34098) >>Output: rl2.activity_id, >> rl2.feed_id >>Index Cond: (rl2.activity_id = >> rl.activity_id) >>Filter: rl2.success >>Buffers: shared hit=3357159 >> read=153313 >>-> Index Scan using activity_pkey on >> public.activity a (cost=0.00..51.10 rows=1 width=12) (actual >> time=0.126..0.127 rows=1 loops=34088) >> Output: a.status_id, a.activity_id, >> a.visit_id >> Index Cond: (a.activity_id = >> rl.activity_id) >> Buffers: shared hit=137925 read=32728 >> >> >> And it's size >> >> conscopy=# select pg_size_pretty(pg_relation_siz >> e('response_log_by_activity'::regclass)); >> pg_size_pretty >> >> 7345 MB >> (1 row) >> >> conscopy=# select pg_size_pretty(pg_relation_siz >> e('activity_pkey'::regclass)); >> pg_size_pretty >> >> 8110 MB >> (1 row) >> >> Index scan on response_log_by_activity is far slower. The table has just >> been repacked, and index rebuilt, but still slow. >> >> Is there any other way to make it faster ? >> >> Why Buffers: shared hit=3,357,159 read=153,313 on >> response_log_by_activity is much bigger than Buffers: shared hit=137925 >> read=32728 on activity_pkey while activity_pkey size is bigger ? >> >> -- >> Regards, >> >> Soni Maula Harriz >> > > > > -- > Regards, > > Soni Maula Harriz > -- Regards, Soni Maula Harriz
Re: [PERFORM] slow index scan performance
It's Postgres 9.1.24 on RHEL 6.5 On Tue, Sep 5, 2017 at 8:24 PM, Soni M <diptat...@gmail.com> wrote: > Consider these 2 index scan produced by a query > > -> Index Scan using response_log_by_activity on public.response_log rl2 > (cost=0.00..51.53 rows=21 width=8) (actual time=9.017..9.056 rows=0 > loops=34098) >Output: rl2.activity_id, rl2.feed_id >Index Cond: (rl2.activity_id = > rl.activity_id) >Filter: rl2.success >Buffers: shared hit=3357159 > read=153313 >-> Index Scan using activity_pkey on > public.activity a (cost=0.00..51.10 rows=1 width=12) (actual > time=0.126..0.127 rows=1 loops=34088) > Output: a.status_id, a.activity_id, > a.visit_id > Index Cond: (a.activity_id = > rl.activity_id) > Buffers: shared hit=137925 read=32728 > > > And it's size > > conscopy=# select pg_size_pretty(pg_relation_size('response_log_by_ > activity'::regclass)); > pg_size_pretty > > 7345 MB > (1 row) > > conscopy=# select pg_size_pretty(pg_relation_size('activity_pkey':: > regclass)); > pg_size_pretty > > 8110 MB > (1 row) > > Index scan on response_log_by_activity is far slower. The table has just > been repacked, and index rebuilt, but still slow. > > Is there any other way to make it faster ? > > Why Buffers: shared hit=3,357,159 read=153,313 on response_log_by_activity > is much bigger than Buffers: shared hit=137925 read=32728 on activity_pkey > while activity_pkey size is bigger ? > > -- > Regards, > > Soni Maula Harriz > -- Regards, Soni Maula Harriz
[PERFORM] slow index scan performance
Consider these 2 index scan produced by a query -> Index Scan using response_log_by_activity on public.response_log rl2 (cost=0.00..51.53 rows=21 width=8) (actual time=9.017..9.056 rows=0 loops=34098) Output: rl2.activity_id, rl2.feed_id Index Cond: (rl2.activity_id = rl.activity_id) Filter: rl2.success Buffers: shared hit=3357159 read=153313 -> Index Scan using activity_pkey on public.activity a (cost=0.00..51.10 rows=1 width=12) (actual time=0.126..0.127 rows=1 loops=34088) Output: a.status_id, a.activity_id, a.visit_id Index Cond: (a.activity_id = rl.activity_id) Buffers: shared hit=137925 read=32728 And it's size conscopy=# select pg_size_pretty(pg_relation_size('response_log_by_activity'::regclass)); pg_size_pretty 7345 MB (1 row) conscopy=# select pg_size_pretty(pg_relation_size('activity_pkey'::regclass)); pg_size_pretty 8110 MB (1 row) Index scan on response_log_by_activity is far slower. The table has just been repacked, and index rebuilt, but still slow. Is there any other way to make it faster ? Why Buffers: shared hit=3,357,159 read=153,313 on response_log_by_activity is much bigger than Buffers: shared hit=137925 read=32728 on activity_pkey while activity_pkey size is bigger ? -- Regards, Soni Maula Harriz
Re: [PERFORM] slow server : s_lock and _bt_checkkeys on perf top
Thanks All for the response, finally we figure it out. The slow is due to high number of dead rows on main table, repack these tables wipe out the issue. On Mar 6, 2015 9:31 PM, Merlin Moncure mmonc...@gmail.com wrote: On Wed, Mar 4, 2015 at 1:31 PM, Soni M diptat...@gmail.com wrote: Hello All, Master db size 1.5 TB All postgres 9.1.13 installed from RHEL package. It has streaming replica and slony replica to another servers. Server performance is slower than usual, before that, there's a big query got cancelled and then performance get slow. No sign of IO wait. on sar, it's %user and %system dominate the cpu usage 01:25:04 PM CPU %user %nice %system %iowait%steal %idle Average:all 51.91 0.00 12.03 0.66 0.00 35.39 on perf top, i saw 18.93% postgres [.] s_lock 10.72% postgres [.] _bt_checkkeys almost always at top. I don't have any idea, what's causing it or how to resolve it ? Can you post the entire 'perf top'? do you see (specifically I'm wondering if you are bumping against the RecoveryInProgress s_lock issue). If so, upgrading postgres might be the best way to resolve the issue. merlin
[PERFORM] slow server : s_lock and _bt_checkkeys on perf top
Hello All, Master db size 1.5 TB All postgres 9.1.13 installed from RHEL package. It has streaming replica and slony replica to another servers. Server performance is slower than usual, before that, there's a big query got cancelled and then performance get slow. No sign of IO wait. on sar, it's %user and %system dominate the cpu usage 01:25:04 PM CPU %user %nice %system %iowait%steal %idle Average:all 51.91 0.00 12.03 0.66 0.00 35.39 on perf top, i saw 18.93% postgres [.] s_lock 10.72% postgres [.] _bt_checkkeys almost always at top. I don't have any idea, what's causing it or how to resolve it ? Any answer is very appreciated. -- Regards, Soni Maula Harriz
Re: [PERFORM] tuning postgresql 9.3.5 and multiple cores
Changing to a higher rate CPU would be more helpful if you run less than 32 queries at a time. On Tue, Aug 26, 2014 at 8:51 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Monday, August 25, 2014, Jeison Bedoya Delgado jeis...@audifarma.com.co wrote: hi, recently i change the hardware of my database 32 cores up to 64 cores and 128GB Ram, but the performance is the same. Perhaps i have to change any parameter in the postgresql.conf?. PostgreSQL does not (yet) automatically parallelize queries. Unless you have more than 32 queries trying to run at the same time, increasing the number of cores from 32 to 64 is unlikely to be useful. Cheers, Jeff -- Regards, Soni Maula Harriz
Re: [PERFORM] Postgres Replaying WAL slowly
Here's what 'perf top' said on streaming replica : Samples: 26K of event 'cpu-clock', Event count (approx.): 19781 95.97% postgres [.] 0x002210f3 0.41% perf [.] 0x0005f225 0.39% libc-2.12.so [.] __strstr_sse2 0.22% libc-2.12.so [.] memchr 0.22% [kernel] [k] kallsyms_expand_symbol 0.18% perf [.] symbols__insert 0.18% [kernel] [k] format_decode 0.15% libc-2.12.so [.] __GI___strcmp_ssse3 0.13% [kernel] [k] string 0.12% [kernel] [k] number 0.12% [kernel] [k] vsnprintf 0.12% libc-2.12.so [.] _IO_vfscanf 0.11% perf [.] dso__find_symbol 0.11% [kernel] [k] _spin_unlock_irqrestore 0.10% perf [.] hex2u64 0.10% postgres [.] hash_search_with_hash_value 0.09% perf [.] rb_next 0.08% libc-2.12.so [.] memcpy 0.07% libc-2.12.so [.] __strchr_sse2 0.07% [kernel] [k] clear_page 0.06% [kernel] [k] strnlen 0.05% perf [.] perf_evsel__parse_sample 0.05% perf [.] rb_insert_color 0.05% [kernel] [k] pointer On Mon, Jun 30, 2014 at 2:05 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 06/29/2014 03:43 PM, Soni M wrote: top and sar says 100% cpu usage of one core, no sign of I/O wait. Hmm, I wonder what it's doing then... If you have perf installed on the system, you can do perf top to get a quick overlook of where the CPU time is spent. - Heikki -- Regards, Soni Maula Harriz
Re: [PERFORM] Postgres Replaying WAL slowly
On Tue, Jul 1, 2014 at 12:14 AM, Andres Freund and...@2ndquadrant.com wrote: My guess it's a spinlock, probably xlogctl-info_lck via RecoveryInProgress(). Unfortunately inline assembler doesn't always seem to show up correctly in profiles... What worked for me was to build with -fno-omit-frame-pointer - that normally shows the callers, even if it can't generate a proper symbol name. Soni: Do you use Hot Standby? Are there connections active while you have that problem? Any other processes with high cpu load? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services It is 96.62% postgres [.] StandbyReleaseLocks as Jeff said. It runs quite long time, more than 5 minutes i think i also use hot standby. we have 4 streaming replica, some of them has active connection some has not. this issue has last more than 4 days. On one of the standby, above postgres process is the only process that consume high cpu load. -- Regards, Soni Maula Harriz
[PERFORM] Postgres Replaying WAL slowly
Hello Everyone ... We have 6 PG 9.1.12 installation, one master (Ubuntu 10.04), one slony slave(Ubuntu 10.04), and four streaming replica (2 on Ubuntu 10.04 and 2 on RHEL 6.5 (Santiago) which lies on different datacenter). All Ubuntu is on same datacenter. Master send wal archive to slony slave. This is the infrastructure description : 200Mbit link between data centers, esx 5.5 on hp blade chassis. proliant gen 7 blades. postgres servers dedicated to esx hosts (no other vms on those esx hosts). 3par disk backends with 4 and 8 Gbit fiber channel connections. 10Gbit ethernet virtual connects on the hp chassis. cisco fabric and network switches. All postgres installed from Ubuntu/RHEL package. Everything works fine until on Thursday we have high load on master, and after that every streaming replica lag further behind the master. Even on night and weekend where all server load is low. But the slony slave is OK at all. We thought it was due to network, so we decide to copy wal files to local of a streaming server, and replaying wal from local. After PG restart, it replays wal on a good speed about 3 seconds per wal file, but as the time goes the speed decreasing. We had 30 seconds per wal file. The worst we get is 3 minutes to replay 1 wal file. The rate of wal produced from master is normal like usual. And also on Thursday we had wal files on pg_xlog on streaming replica server, but no other wal files. This is the configuration : SELECT name, current_setting(name) FROM pg_settings WHERE source NOT IN ('default', 'override'); name | current_setting +--- application_name | psql archive_command| /var/lib/postgresql/scripts/wal_archive %p %f archive_mode | on checkpoint_completion_target | 0.7 checkpoint_segments| 30 client_encoding| UTF8 DateStyle | ISO, MDY default_text_search_config | pg_catalog.english effective_cache_size | 125GB effective_io_concurrency | 3 external_pid_file | /var/run/postgresql/9.1-main.pid hot_standby| on hot_standby_feedback | on lc_messages| en_US.UTF-8 lc_monetary| en_US.UTF-8 lc_numeric | en_US.UTF-8 lc_time| en_US.UTF-8 listen_addresses | * log_checkpoints| on log_connections| on log_destination| csvlog log_directory | pg_log log_disconnections | on log_filename | postgresql-%a.log log_line_prefix| %t log_lock_waits | on log_rotation_age | 1d log_rotation_size | 0 log_temp_files | 100kB log_timezone | localtime log_truncate_on_rotation | on logging_collector | on maintenance_work_mem | 1GB max_connections| 750 max_locks_per_transaction | 900 max_pred_locks_per_transaction | 900 max_stack_depth| 2MB max_wal_senders| 6 port | 5432 shared_buffers | 8GB ssl| on temp_buffers | 64MB TimeZone | America/Chicago unix_socket_directory | /var/run/postgresql wal_keep_segments | 50 wal_level | hot_standby work_mem | 256MB (47 rows) Thanks for any help -- Regards, Soni Maula Harriz
Re: [PERFORM] Postgres Replaying WAL slowly
top and sar says 100% cpu usage of one core, no sign of I/O wait. The database is 1.5TB in size. RAM in master is 145GB, on slave it's differ, some has about 16GB another has 145GB also. nothing suspicious on standby's postgres log. on master's postgres log : WARNING,01000,pgstat wait timeout, ERROR,57014,canceling autovacuum task,automatic vacuum of table consprod._consprod_replication.sl_event ERROR,57014,canceling statement due to statement timeout,, PARSE,2014-06-26 00:39:35 CDT,91/0,0,ERROR,25P02,current transaction is aborted, commands ignored until end of transaction block,,select 1,,, could not receive data from client: Connection reset by peer, the log files is big anyway. if you can specify some pattern to look at the log, that would really help. On Sun, Jun 29, 2014 at 3:31 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 06/29/2014 11:14 AM, Soni M wrote: Everything works fine until on Thursday we have high load on master, and after that every streaming replica lag further behind the master. Even on night and weekend where all server load is low. But the slony slave is OK at all. What does 'top' on the standby say? Is the startup process using 100% of (one) CPU replaying records, or is it waiting for I/O? How large is the database, does it fit in RAM? Any clues in the system or PostgreSQL logs? - Heikki -- Regards, Soni Maula Harriz