Why not set up a spot EC2 instance with Postgres 10.1, load the database from a dump (yes, you'll need to create one from RDS, since they don't provide direct access to dumps/backups; you probably only need specific tables), and then repeat your actions while keeping a close eye on the filesystem?
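For example, to pick out which tables are worth dumping, a query along these lines (a rough sketch; the schema filter is an assumption, adjust to taste) lists the largest relations on the RDS instance:

    -- Top 20 relations by total size (table + indexes + TOAST)
    select c.relname, c.relkind,
           pg_size_pretty(pg_total_relation_size(c.oid)) as total_size
    from pg_class c
    join pg_namespace n on n.oid = c.relnamespace
    where c.relkind in ('r', 'm')  -- ordinary tables and materialized views
      and n.nspname not in ('pg_catalog', 'information_schema')
    order by pg_total_relation_size(c.oid) desc
    limit 20;

On the EC2 box you can then watch the data directory grow in real time (e.g. `watch du -sh` over $PGDATA), which RDS doesn't let you do.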
Wed, 16 May 2018 at 13:10, Jonathan Marks <jonathanaveryma...@gmail.com>:

> Hi Tom —
>
> We turned on log_temp_files and since the last stats reset (about a week
> ago) we’re seeing 0 temp files altogether (grabbing that info from
> pg_stat_database).
>
> So, as far as we know:
>
> 1) It’s not WAL
> 2) It’s not temp files
> 3) It’s not the size of the error logs
> 4) It’s not the size of the actual rows in the database or the indexes
>
> Another thread we found suggested pg_subtrans — this seems less likely
> because we’ve been able to replicate this across many different types of
> connections, etc., but we thought it might be a potential source.
>
> Are there any other system-monitoring queries we can run that might
> further illuminate the issue?
>
> Thank you!
>
> > On May 14, 2018, at 3:31 PM, Jonathan Marks <jonathanaveryma...@gmail.com> wrote:
> >
> > We’ll turn on log_temp_files and get back to you to see if that’s the
> > cause. Re: the exact queries — these are just normal INSERTs and UPDATEs.
> > This occurs as part of normal database operations — i.e., we are
> > processing 10% of a table and marking changes to a particular row, or
> > happen to be inserting 5-10% of the table volume with new rows. Whenever
> > we bulk load we have to drop the indexes because the disk space loss
> > just isn’t tenable.
> >
> > Re: extra disk space consumption not within PG — the AWS folks can’t
> > tell me what the problem is because it’s all internal to the PG part of
> > the instance, which they can’t access. That doesn’t mean your last
> > suggestion can’t be the case, but it makes it slightly less likely.
> >
> > Any chance that GIN indexes are double-logging? I.e., with fastupdate
> > off, are they still trying to keep track of the changes in the pending
> > list or something?
> >
> > Our thought has been temp files for a while, but we’re not sure what we
> > should do if that turns out to be the case.
> >
> >> On May 14, 2018, at 3:08 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >>
> >> [ please keep the list cc'd ]
> >>
> >> Jonathan Marks <jonathanaveryma...@gmail.com> writes:
> >>> Thanks for your quick reply. Here’s a bit more information:
> >>> 1) To measure the “size of the database” we run something like `select
> >>> datname, pg_size_pretty(pg_database_size(datname)) from pg_database;`
> >>> I’m not sure if this includes WAL size.
> >>> 2) I’ve tried measuring WAL size with `select sum(size) from
> >>> pg_ls_waldir();` — this also doesn’t budge.
> >>> 3) Our current checkpoint_timeout is 600s with a
> >>> checkpoint_completion_target of 0.9 — what does that suggest?
> >>
> >> Hmph. Your WAL-size query seems on point, and that pretty much destroys
> >> my idea about a WAL emission spike.
> >>
> >> pg_database_size() should include all regular and temporary
> >> tables/indexes in the named DB. It doesn't include WAL (but we've
> >> eliminated that), nor cluster-wide tables such as pg_database (but
> >> those seem pretty unlikely to be at issue), nor non-relation temporary
> >> files such as sort/hash temp space. At this point I think we have to
> >> focus our attention on what might be creating large temp files. I do
> >> not see anything in the GIN index code that could do that, especially
> >> not if you have fastupdate off. I wonder whether there is something
> >> about the particular bulk-insertion queries you're using that could
> >> result in large temp files --- which'd make the apparent correlation
> >> with GIN index use a mirage, but we're running out of other ideas. You
> >> could try enabling log_temp_files to see if there's anything to that.
> >>
> >> In the grasping-at-straws department: are you quite sure that the extra
> >> disk space consumption is PG's to begin with, rather than something
> >> outside the database entirely?
> >>
> >> 			regards, tom lane
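In the meantime, the checks discussed above boil down to a handful of queries you could rerun after each bulk load (a sketch assembled from the queries already quoted in this thread; pg_ls_waldir() is PostgreSQL 10+, and the pg_subtrans check is my addition — it relies on superuser filesystem functions that RDS restricts, so it may only work on a self-hosted instance like the EC2 one suggested above):

    -- Temp file activity since the last stats reset
    -- (the thread reports 0 temp files here)
    select datname, temp_files,
           pg_size_pretty(temp_bytes) as temp_bytes,
           stats_reset
    from pg_stat_database;

    -- Per-database size, as measured earlier in the thread
    select datname, pg_size_pretty(pg_database_size(datname))
    from pg_database;

    -- Total WAL size (PostgreSQL 10+)
    select pg_size_pretty(sum(size)) from pg_ls_waldir();

    -- Size of pg_subtrans, the other suspect mentioned above;
    -- needs direct filesystem access, so self-hosted only
    select pg_size_pretty(sum((pg_stat_file('pg_subtrans/' || f)).size))
    from pg_ls_dir('pg_subtrans') as f;

If the database, WAL, and temp-file numbers all stay flat while `du` on the data directory keeps climbing, that would point at something outside the relations themselves.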