Although probably not the root cause, at the least I would set up hugepages ( https://www.postgresql.org/docs/9.6/static/kernel-resources.html#LINUX-HUGE-PAGES ), and bump effective_io_concurrency up quite a bit as well (256 ?).
On Mon, Jul 10, 2017 at 10:03 AM, Charles Nadeau <charles.nad...@gmail.com> wrote: > I’m running PostgreSQL 9.6.3 on Ubuntu 16.10 (kernel 4.4.0-85-generic). > Hardware is: > > *2x Intel Xeon E5550 > > *72GB RAM > > *Hardware RAID10 (4 x 146GB SAS 10k) P410i controller with 1GB FBWC (80% > read/20% write) for Postgresql data only: > > Logical Drive: 3 > > Size: 273.4 GB > > Fault Tolerance: 1+0 > > Heads: 255 > > Sectors Per Track: 32 > > Cylinders: 65535 > > Strip Size: 128 KB > > Full Stripe Size: 256 KB > > Status: OK > > Caching: Enabled > > Unique Identifier: 600508B1001037383941424344450A00 > > Disk Name: /dev/sdc > > Mount Points: /mnt/data 273.4 GB > > OS Status: LOCKED > > Logical Drive Label: A00A194750123456789ABCDE516F > > Mirror Group 0: > > physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS, 146 GB, OK) > > physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS, 146 GB, OK) > > Mirror Group 1: > > physicaldrive 2I:1:7 (port 2I:box 1:bay 7, SAS, 146 GB, OK) > > physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SAS, 146 GB, OK) > > Drive Type: Data > > Formatted with ext4 with: sudo mkfs.ext4 -E stride=32,stripe_width=64 -v > /dev/sdc1. > > Mounted in /etc/fstab with this line: > "UUID=99fef4ae-51dc-4365-9210-0b153b1cbbd0 > /mnt/data ext4 rw,nodiratime,user_xattr,noatime,nobarrier,errors=remount-ro > 0 1" > > Postgresql is the only application running on this server. > > > Postgresql is used as a mini data warehouse to generate reports and do > statistical analysis. It is used by at most 2 users and fresh data is added > every 10 days. The database has 16 tables: one is 224GB big and the rest > are between 16kB and 470MB big. > > > My configuration is: > > > name | current_setting | source > > ---------------------------------+-------------------------- > ----------------------+---------------------- > > application_name | psql | client > > autovacuum_vacuum_scale_factor | 0 | configuration file > > autovacuum_vacuum_threshold | 2000 | configuration file > > checkpoint_completion_target | 0.9 | configuration file > > checkpoint_timeout | 30min | configuration file > > client_encoding | UTF8 | client > > client_min_messages | log | configuration file > > cluster_name | 9.6/main | configuration file > > cpu_index_tuple_cost | 0.001 | configuration file > > cpu_operator_cost | 0.0005 | configuration file > > cpu_tuple_cost | 0.003 | configuration file > > DateStyle | ISO, YMD | configuration file > > default_statistics_target | 100 | configuration file > > default_text_search_config | pg_catalog.english | configuration file > > dynamic_shared_memory_type | posix | configuration file > > effective_cache_size | 22GB | configuration file > > effective_io_concurrency | 4 | configuration file > > external_pid_file | /var/run/postgresql/9.6-main.pid | configuration file > > lc_messages | C | configuration file > > lc_monetary | en_CA.UTF-8 | configuration file > > lc_numeric | en_CA.UTF-8 | configuration file > > lc_time | en_CA.UTF-8 | configuration file > > listen_addresses | * | configuration file > > lock_timeout | 100s | configuration file > > log_autovacuum_min_duration | 0 | configuration file > > log_checkpoints | on | configuration file > > log_connections | on | configuration file > > log_destination | csvlog | configuration file > > log_directory | /mnt/bigzilla/data/toburn/hp/postgresql/pg_log | > configuration file > > log_disconnections | on | configuration file > > log_error_verbosity | default | configuration file > > log_file_mode | 0600 | configuration file > > log_filename | postgresql-%Y-%m-%d_%H%M%S.log | configuration file > > log_line_prefix | user=%u,db=%d,app=%aclient=%h | configuration file > > log_lock_waits | on | configuration file > > log_min_duration_statement | 0 | configuration file > > log_min_error_statement | debug1 | configuration file > > log_min_messages | debug1 | configuration file > > log_rotation_size | 1GB | configuration file > > log_temp_files | 0 | configuration file > > log_timezone | localtime | configuration file > > logging_collector | on | configuration file > > maintenance_work_mem | 3GB | configuration file > > max_connections | 10 | configuration file > > max_locks_per_transaction | 256 | configuration file > > max_parallel_workers_per_gather | 14 | configuration file > > max_stack_depth | 2MB | environment variable > > max_wal_size | 4GB | configuration file > > max_worker_processes | 14 | configuration file > > min_wal_size | 2GB | configuration file > > parallel_setup_cost | 1000 | configuration file > > parallel_tuple_cost | 0.012 | configuration file > > port | 5432 | configuration file > > random_page_cost | 22 | configuration file > > seq_page_cost | 1 | configuration file > > shared_buffers | 34GB | configuration file > > shared_preload_libraries | pg_stat_statements | configuration file > > ssl | on | configuration file > > ssl_cert_file | /etc/ssl/certs/ssl-cert-snakeoil.pem | configuration file > > ssl_key_file | /etc/ssl/private/ssl-cert-snakeoil.key | configuration file > > statement_timeout | 1000000s | configuration file > > stats_temp_directory | /var/run/postgresql/9.6-main.pg_stat_tmp | > configuration file > > superuser_reserved_connections | 1 | configuration file > > syslog_facility | local1 | configuration file > > syslog_ident | postgres | configuration file > > syslog_sequence_numbers | on | configuration file > > temp_file_limit | 80GB | configuration file > > TimeZone | localtime | configuration file > > track_activities | on | configuration file > > track_counts | on | configuration file > > track_functions | all | configuration file > > unix_socket_directories | /var/run/postgresql | configuration file > > vacuum_cost_delay | 1ms | configuration file > > vacuum_cost_limit | 5000 | configuration file > > vacuum_cost_page_dirty | 200 | configuration file > > vacuum_cost_page_hit | 10 | configuration file > > vacuum_cost_page_miss | 100 | configuration file > > wal_buffers | 16MB | configuration file > > wal_compression | on | configuration file > > wal_sync_method | fdatasync | configuration file > > work_mem | 1468006kB | configuration file > > > The part of /etc/sysctl.conf I modified is: > > vm.swappiness = 1 > > vm.dirty_background_bytes = 134217728 > > vm.dirty_bytes = 1073741824 > > vm.overcommit_ratio = 100 > > vm.zone_reclaim_mode = 0 > > kernel.numa_balancing = 0 > > kernel.sched_autogroup_enabled = 0 > > kernel.sched_migration_cost_ns = 5000000 > > > The problem I have is very poor read. When I benchmark my array with fio I > get random reads of about 200MB/s and 1100IOPS and sequential reads of > about 286MB/s and 21000IPS. But when I watch my queries using pg_activity, > I get at best 4MB/s. Also using dstat I can see that iowait time is at > about 25%. This problem is not query-dependent. > > I backed up the database, I reformated the array making sure it is well > aligned then restored the database and got the same result. > > Where should I target my troubleshooting at this stage? I reformatted my > drive, I tuned my postgresql.conf and OS as much as I could. The hardware > doesn’t seem to have any issues, I am really puzzled. > > Thanks! > > > Charles > > -- > Charles Nadeau Ph.D. >