Re: Some observations on very slow pg_restore operations

2019-10-03 Thread Justin Pryzby
What is the state of KSM/THP? Did you try disabling them ? I've seen these can cause high iowait (although that was a virtual environment). It would be interesting to see vmstat output. tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent

Re: Some observations on very slow pg_restore operations

2019-10-03 Thread Ogden Brash
Thanks Michael, I am sure that there is some optimization to be found in the config (and below are all the non-default values in the file). I suspect that they won't explain the difference between restoring from nfs vs local drive, but they could certainly speed up my best case. The dataset is so

Re: Some observations on very slow pg_restore operations

2019-10-03 Thread MichaelDBA
Hi Ogden, You didn't mention any details about your postgresql.conf settings.  Why don't you set them optimally for your loads and try again and see if there is any difference.  Make sure you do a DB restart since some of these parameters require it. == pa

Some observations on very slow pg_restore operations

2019-10-03 Thread Ogden Brash
I recently performed a pg_dump (data-only) of a relatively large database where we store intermediate results of calculations. It is approximately 3 TB on disk and has about 20 billion rows. We do the dump/restore about once a month and as the dataset has grown, the restores have gotten very slow.

Re: pg12 - partition by column that might have null values

2019-10-03 Thread Michael Lewis
Just create a partial index on id column where end_date = infinity (or null if you really prefer that pattern) and the system can quickly find the rows that are still most current revision. How many rows do you have in this table? Or foresee ever having? What took you down the road of partitioning

Re: pg12 - partition by column that might have null values

2019-10-03 Thread Mariel Cherkassky
Not sure how data storage is relevant here, I was only focusing on query optimization. Lets say that most of the data isnt moving (history data). However, objects can be changed and therefore new revisions are added and the previous revisions updated (their end_date is updated). If you run queries