On Aug 14, 2017 14:12, "Chris Travers" <chris.trav...@adjust.com> wrote:
Hi all,

I am trying to track down a problem we are seeing that looks very similar to bug #12050, and I would certainly consider contributing a fix if we can agree on one. (I am not sure we can; absent that, the next question is whether it makes sense to create a utility that fixes the problem when it comes up, so that a dump/restore is not needed.)

The system: PostgreSQL 9.6.3 on Gentoo Linux.

Problem: The system has had repeated trouble with disk space. Both pg_database_size and du on the corresponding subdirectory of base/ show total usage of around 3.8 TB. Summing the sizes of the relations in pg_class, though, gives only around 2.1 TB.

Initial troubleshooting found around 150 GB of files in pg_temp that had never been cleared and were at least several days old. Restarting the server cleared them.

Poking around the base/[oid] directory, I found a large number of files that did not correspond to any pg_class entry. One of these apparent relations was nearly 1 TB in size.

What I think happened: I think the various pg_temp/* files and the orphaned relation files (in base/[oid]) were left behind when PostgreSQL crashed after running out of space during various operations, including creating materialised views.

So my question is whether there is a way we can safely clean these up on server restart. If not, does it make sense to create a utility that connects to PostgreSQL, seeks out the valid files, and deletes the rest?

Ok, I have identified one case where the symptoms I am seeing can be reproduced. I am currently working on a Mac, so there may be quirks in my repro. However, when the WAL writer runs out of disk space, no cleanup is done. So I will be looking at possible solutions next.

--
Best Regards,
Chris Travers
Database Administrator
Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin
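
The comparison described above (files in base/[oid] versus what the catalog knows about) can be sketched roughly as follows. This is only an illustration, not a proposed utility: the function name orphan_candidates and its inputs are hypothetical, the file-name pattern assumes the usual <filenode>[_fsm|_vm|_init][.segment] layout, and temporary-relation files (t<backend>_<filenode>) are deliberately ignored rather than classified. It only reports candidates and never deletes anything.

```python
import re

# Assumed on-disk naming for permanent relation files:
#   <filenode>            main fork, first segment
#   <filenode>.<N>        further 1 GB segments
#   <filenode>_fsm/_vm/_init[.<N>]  other forks
_REL_FILE = re.compile(r"^(?P<node>\d+)(?:_(?:fsm|vm|init))?(?:\.\d+)?$")


def orphan_candidates(filenames, valid_filenodes):
    """Return file names whose filenode is not in valid_filenodes.

    filenames:       names found in one base/<db oid> directory
    valid_filenodes: filenodes the catalog reports for that database
    Names that do not look like relation files (PG_VERSION,
    pg_filenode.map, temp-relation files, ...) are skipped.
    """
    valid = set(valid_filenodes)
    orphans = []
    for name in filenames:
        match = _REL_FILE.match(name)
        if match is None:
            continue  # not a permanent relation file; leave it alone
        if int(match.group("node")) not in valid:
            orphans.append(name)
    return sorted(orphans)


if __name__ == "__main__":
    # Made-up directory listing, for illustration only.
    listing = ["16384", "16384.1", "16384_fsm", "99999", "99999.1",
               "PG_VERSION", "pg_filenode.map", "t3_16500"]
    print(orphan_candidates(listing, {16384}))
```

The set of valid filenodes could be collected while connected to the database in question with something like SELECT pg_relation_filenode(oid) FROM pg_class, skipping NULL results. Note that a relation created by a transaction that has not yet committed is not visible in pg_class to other sessions, so a file flagged here is only a candidate and would need further checks before anyone removed it.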