I'm running PostgreSQL 9.5.4 on a virtual machine for production purposes. The
VM runs Ubuntu 16.04.1 LTS 64-bit and has 32GB of RAM, 461GB of disk space and
4 logical CPUs.

Postgres handles the following workloads:
- many INSERTs for the ETL
- a lot of read and write operations for the main OLTP application

The ETL job is still under development, so I'm launching several sequential
runs in order to get the whole thing working. The ETL procedure consists of a
lot of INSERTs packed inside transactions. At the moment each transaction
consists of 100k INSERTs, so for a 90-million-row table I get 90 million
INSERTs packed into 900 transactions. I know it's not ideal, but the JDBC
driver combined with Pentaho doesn't seem to pack more rows into a single
INSERT, so I get a lot of overhead: I can see a separate PARSE and BIND for
every INSERT. I think it's Pentaho that wraps each INSERT in a parameterized
query. I hate Pentaho, but anyway.
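To give an idea, each batch that reaches the server looks roughly like this
(the table and column names are made up, the real ones come from Pentaho):

BEGIN;
INSERT INTO data_schema.some_table (col1, col2, col3) VALUES ($1, $2, $3);
-- parsed, bound and executed once per row, 100k times per transaction
COMMIT;

A multi-row INSERT (or COPY) would avoid the per-row PARSE/BIND, but as said
above Pentaho doesn't seem to produce that.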

The ETL procedure does the following:
1) drops any previous run with DROP SCHEMA IF EXISTS data_schema CASCADE;
2) creates the "data_schema" schema and populates it with tables and rows
using the INSERTs described above;
3) if an error occurs, drops the schema again.
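In SQL terms every run boils down to something like this (the table definition
is just a sketch, the real DDL is generated by Pentaho):

DROP SCHEMA IF EXISTS data_schema CASCADE;
CREATE SCHEMA data_schema;
CREATE TABLE data_schema.some_table (...);   -- one per target table
-- ... then the batched INSERTs shown above ...
-- if anything fails: DROP SCHEMA data_schema CASCADE;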

I'm repeating these steps many times because of some Pentaho errors the team
is still working on. This stresses the WAL, because interrupting the process
aborts the current transaction and is followed by a DROP SCHEMA ... CASCADE.

A few days after we began debugging the ETL process, the disk filled up and
the last ETL job was automatically aborted. Note that the database data
directory is located on the same root disk, at /var/lib/postgresql/9.5/main.

What shocked me was that the Postgres data directory was just 815MB in size
($ du -h /var/lib/postgresql/9.5/main) and pg_xlog was 705MB, but the entire
disk was full ("df -h" reported a disk usage of 100%).

I looked for any Postgres activity and only noticed a checkpoint writer
process writing at low speed (I/O usage was about 5%). Also, "SELECT * FROM
pg_stat_activity" returned nothing, and the most shocking part was that
"du -h /" reported just 56GB as the total size of the files stored on the
whole disk! The same was true for "du -ha /", which reports apparent sizes.

The total disk size is 461GB, so how is it possible that "df -h" reported
461GB occupied while "du -h /" reported just 56GB?

After executing:
$ service postgresql stop
$ service postgresql start

the disk space was freed and "df -h" reported a usage of just 16%!
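My best guess is that the space was held by files that had already been
deleted but were still kept open by some process, which would also explain why
restarting Postgres freed it. If I understand correctly, something like one of
these should list such files while the problem is happening:

$ lsof +L1
$ lsof / | grep deleted

Does that sound plausible?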

The other questions are:
- how can I prevent the disk from filling up? I'm using the default WAL
configuration (1GB max size);
- how can I tune Postgres to speed up the INSERTs?

The current configuration is the following:
listen_addresses = 'localhost'
max_connections = 32
shared_buffers = 16GB
work_mem = 128MB
maintenance_work_mem = 512MB
effective_io_concurrency = 10
checkpoint_completion_target = 0.9
cpu_tuple_cost = 0.02
cpu_index_tuple_cost = 0.01
cpu_operator_cost = 0.005
effective_cache_size = 24GB
default_statistics_target = 1000

Could it be that some of these parameters cause this strange behavior?
checkpoint_completion_target, maybe?
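For the INSERT speed, I was thinking of trying something along these lines
during the loads (the values are just guesses on my part, not tested):

max_wal_size = 4GB            # fewer WAL-triggered checkpoints during the load
checkpoint_timeout = 15min

and, only in the ETL session:

SET synchronous_commit = off;

which, if I understand correctly, only risks losing the last few transactions
on a crash rather than corrupting anything. Does that make sense, or would it
just make the disk problem worse?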

Thanks to everyone for the support.

Best regards,
 Pietro Pugni


