On Sat, 28 Dec 2024 at 08:14, James Hunter <james.hunter...@gmail.com> wrote:
> 2. We use this backend_work_mem to "adjust" work_mem values used by
> the executor. (I don't care about the optimizer right now -- optimizer
> just does its best to predict what will happen at runtime.)
While I do want to see improvements in this area, I think "don't care
about the optimizer" is going to cause performance issues. The problem
is that the optimizer takes into account what work_mem is set to when
calculating the costs of work_mem-consuming node types. See costsize.c
for usages of "work_mem". If you go and reduce the amount of memory a
given node can consume after the costs have been applied, then we may
end up in a situation where some other plan would have suited much
better.

There's also the problem of what to do when you chop work_mem down so
far that the remaining size is just a pitiful chunk. For now, work_mem
can't go below 64 kilobytes. You might think it's very unlikely that
it'd be chopped down that far, but with partition-wise join and
partition-wise aggregate, we could end up using a work_mem per
partition, and if you have thousands of partitions then you might end
up reducing work_mem by quite a large amount.

I think the best solution to this is the memory grant stuff I talked
about in [1]. That does require figuring out which nodes will consume
their work_mem concurrently, so the infrastructure you talked about
would be a good step towards that, but it's probably not the most
difficult part of that idea.

I definitely encourage work in this area, but I think what you're
proposing might be swapping one problem for another.

David

[1] https://www.postgresql.org/message-id/caaphdvrzacgea1zrea2aio_lai2fqcgok74bzgfbddg4ymw...@mail.gmail.com
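
To make the plan-time dependence concrete, here is a toy, standalone
sketch (not PostgreSQL source; the function name, constants, and cost
model are all invented for illustration) of how a work_mem assumption
gets baked into a sort cost estimate, and why shrinking the budget
after the plan is chosen leaves that estimate stale:

/*
 * Standalone illustration, not PostgreSQL code: a toy cost model in
 * which whether the input fits in the assumed memory budget decides
 * whether spill I/O gets charged. All names and constants are made up.
 */
#include <stdio.h>

#define SEQ_PAGE_COST    1.0
#define CPU_COMPARE_COST 0.01

/* Toy sort cost: in-memory sorts are cheap; spilling adds per-page I/O. */
static double
toy_sort_cost(double input_bytes, double work_mem_bytes, double ntuples)
{
    double cost = ntuples * CPU_COMPARE_COST;   /* comparison work */

    if (input_bytes > work_mem_bytes)
    {
        /* charge I/O for writing and re-reading spill files */
        double pages = input_bytes / 8192.0;

        cost += 2.0 * pages * SEQ_PAGE_COST;
    }
    return cost;
}

int
main(void)
{
    double input_bytes = 32.0 * 1024 * 1024;    /* 32 MB of tuples to sort */
    double ntuples = 100000.0;

    /* Plan time: a 64 MB budget is assumed, so the sort looks cheap. */
    printf("cost at plan time (64MB assumed): %.1f\n",
           toy_sort_cost(input_bytes, 64.0 * 1024 * 1024, ntuples));

    /* Run time: the budget is chopped to 4 MB and the same node spills. */
    printf("cost if run with only 4MB:        %.1f\n",
           toy_sort_cost(input_bytes, 4.0 * 1024 * 1024, ntuples));

    return 0;
}

In this toy model the node that was costed at 1000 behaves like a
9000-plus-cost node once its budget is cut, so any competing plan the
planner rejected at, say, 3000 would actually have been the better
choice.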