On Mon, Jul 13, 2020 at 8:05 PM Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Mon, Jul 13, 2020 at 10:23 AM Henrique Montenegro <typ...@gmail.com> > wrote: > > insert into users_no_dups ( >> created_ts, >> user_id, >> name, >> url >> ) ( >> select >> created_ts, >> user_id, >> name, >> url >> from >> users >> ) on conflict do nothing >> > > Once the size of the only index exceeds shared_buffers by a bit (the > amount of "a bit" depends on your RAM, kernel version, settings > for dirty_background_ratio, dirty_expire_centisecs, and probably other > things, and is not easy to predict) the performance falls off a cliff when > inserting values in a random order. Every insert dirties a random index > leaf page, which quickly gets evicted from shared_buffers to make room for > other random leaf pages to be read in, and then turns into flush calls when > the kernel freaks out about the amount and age of dirty pages held in > memory. > That is interesting to know. I will do some research on those things. > What happens if you add an "ORDER BY user_id" to your above select? > I don't know. I will give it a try right now. > > >> shared_buffers = 8GB >> RAM: 256GB >> > > Or, crank up shared_buffers by a lot. Like, beyond the size of the > growing index, or up to 240GB if the index ever becomes larger than that. > And make the time between checkpoints longer. If the dirty buffers are > retained in shared_buffers longer, chances of them getting dirtied > repeatedly between writes is much higher than if you just toss them to the > kernel and hope for the best. > > I cranked it up to 160GB to see how it goes. Cheers, > > Jeff > I created the partitions as well as mentioned before. I was able to partition the table based on the user_id (found some logic to it). I was transferring the data from the original table (about 280 million records; 320GB) to the new partitioned table and things were going well with write speeds between 30MB/s and 50MB/s. After reading 270GB of the 320GB (in 4 and a half hours) and writing it to the new partitioned table, write speed went down to 7KB/s. It is so frustrating. I will keep the partitions and try your suggestions to see how it goes. I apologize for the long time between replies, it is just that testing this stuff takes 4+ hours each run. If there are any other suggestions of things for me to look meanwhile as well, please keep them coming. Thanks! Henrique