Dear Hugo,

I am terribly sorry for the delay, but I thought that I would not reply
until there had been some progress on this.
On Mon, Sep 22, 2025 at 10:09 PM Wen, Hugo via developers
<[email protected]> wrote:
>
> Dear MariaDB Developers,
>
> We've been working with the new buffer pool allocation options in MariaDB
> releases (10.11.12/13/14, 11.4.6/7/8, and 11.8.2/3), and we've encountered
> several concerning issues that we believe need immediate attention.
>
> Could we revisit the solution and make a proper plan to improve this
> feature? Related Jiras: https://jira.mariadb.org/browse/MDEV-37557 ,
> https://jira.mariadb.org/browse/MDEV-37176
> Summarizing the main issues:
>
> The default value of innodb_buffer_pool_size_max equals
> innodb_buffer_pool_size, and prevents users from increasing the buffer
> pool size dynamically. This appears to be a significant regression.

Yes, you would have to start the server with a value of
innodb_buffer_pool_size_max that corresponds to the maximum that you will
need during the lifetime of the server process. The InnoDB buffer pool will
be a contiguous chunk of virtual address space, and the actual
innodb_buffer_pool_size will be allocated within that virtual address range.

The reason why https://jira.mariadb.org/browse/MDEV-29445 was implemented in
a point release is that there was some recent interest in improving the
performance of the InnoDB adaptive hash index. The overcomplicated memory
structures of the MySQL 5.7/MariaDB 10.2 InnoDB buffer pool resizing were
hurting not only performance but also correctness. The heap-use-after-free
issue https://jira.mariadb.org/browse/MDEV-28123 as well as the intermittent
regression test failures in https://jira.mariadb.org/browse/MDEV-35485 were
fixed by the simplified design.

> Another regression is that the server will fail to start if a user
> upgrades to the new versions with innodb_buffer_pool_size originally set
> to a value higher than the available memory size.
>
> During initialization and resize operations, the engine uses fundamentally
> different approaches for memory allocation:
>
> During initialization, the engine reserves a contiguous virtual memory
> address range without immediately committing physical memory.
> When setting the buffer pool size dynamically, the engine attempts to
> commit additional memory from the previously reserved space. This
> immediately attempts to allocate additional physical memory.

I believe that this was reported in
https://jira.mariadb.org/browse/MDEV-36780 and fixed in May 2025. I wrote
the following comment in that ticket:

  If someone disagrees about the MAP_POPULATE during
  SET GLOBAL innodb_buffer_pool_size, we could introduce a configuration
  parameter for this, say, SET GLOBAL innodb_buffer_pool_overcommit=ON.
  Then, when we actually run out of memory when trying to use the
  over-committed buffer pool, the mariadbd process could be killed.

Currently, the prefaulting tries to ensure that the requested memory will
actually be available for the buffer pool; if it is not, InnoDB will back
off to the current buffer pool size. Would you prefer to have such a
configuration parameter?

> During buffer pool resizing operations, the engine can crash when memory
> conditions aren't ideal due to the above, and behavior differs across
> different operating systems. (Engine Crashes During Buffer Pool Resizing
> in MDEV-37557)

Something similar was already reported in
https://jira.mariadb.org/browse/MDEV-9236. Do you believe that this is a
different scenario?
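To make the reserve-versus-commit distinction discussed above concrete,
here is a minimal sketch of the general technique, assuming Linux mmap()
semantics. This is not the actual InnoDB code; the functions
pool_reserve() and pool_commit() are made-up names for illustration:

  #define _GNU_SOURCE /* for MAP_POPULATE */
  #include <stdio.h>
  #include <sys/mman.h>

  /* Reserve a contiguous virtual address range without committing
     physical memory; this corresponds to innodb_buffer_pool_size_max. */
  static void *pool_reserve(size_t max_size)
  {
    void *p = mmap(NULL, max_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
  }

  /* Commit the first "size" bytes of the reservation; this corresponds
     to growing innodb_buffer_pool_size. MAP_POPULATE asks the kernel to
     fault the pages in right away, so a memory shortage is detected here
     rather than at some random page access later. */
  static int pool_commit(void *base, size_t size)
  {
    void *p = mmap(base, size, PROT_READ | PROT_WRITE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE,
                   -1, 0);
    return p == MAP_FAILED ? -1 : 0;
  }

  int main(void)
  {
    const size_t max_size = 1UL << 30;  /* reserve 1 GiB of address space */
    const size_t size = 128UL << 20;    /* commit 128 MiB of it */
    void *base = pool_reserve(max_size);
    if (!base || pool_commit(base, size))
    {
      perror("buffer pool allocation");
      return 1;
    }
    printf("reserved %zu bytes, committed %zu bytes at %p\n",
           max_size, size, base);
    munmap(base, max_size);
    return 0;
  }

With this scheme, growing the buffer pool never moves or copies anything;
it only commits more of the already reserved address range. That is also
why innodb_buffer_pool_size_max has to be chosen at server startup.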
One thing that I would like to point out is that the old buffer pool
resizing was prone to hanging the entire server if an attempt was made to
shrink the buffer pool too much. Now the logic should be more robust. The
new implementation tries to detect when we are running critically low on
buffer pool space, and may abort the buffer pool resizing operation in
order to rescue ourselves from an imminent deadlock.

As you may be aware, a deadlock is unavoidable with a tiny buffer pool in a
scenario where multiple concurrent client connections are holding some page
latches while attempting to load further pages into the buffer pool. If
there are enough such threads, then the entire buffer pool will consist of
latched dirty pages. The page latches would prevent the pages from being
written back to the data files, and dirty pages cannot be evicted. So the
worst-case scenario is that the entire buffer pool is filled with latched
dirty pages, as well as possibly some pages allocated for the adaptive hash
index or explicit record locks. For example, if all threads are executing
locking table scans on different tables, then
https://jira.mariadb.org/browse/MDEV-24813 could allow this scenario to
occur.

> There is a documentation gap. The new parameters
> (innodb_buffer_pool_size_max, innodb_buffer_pool_size_auto_min,
> innodb_log_checkpoint_now) were introduced with barely any explanation.
> Users have had no way to understand these behaviors due to missing
> documentation. A Jira ticket was created in July for documentation and is
> still in an open state (https://jira.mariadb.org/browse/MDEV-37176). We
> noticed some documentation was recently added in
> https://mariadb.com/docs/server/server-usage/storage-engines/innodb/innodb-system-variables,
> but some of it does not match the behavior.
>
> For example, the innodb_buffer_pool_size_max default value of 134217728
> (128 MiB) appears misleading. The description "Maximum
> innodb_buffer_pool_size" doesn't explain much.

MariaDB switched documentation systems earlier this year. I had in fact
filed an internal documentation request four days before that ticket was
filed, but our documentation team had been busy with other tasks. Finally,
after I sent some reminders over the past month or two, the documentation
was updated yesterday.

With best regards,

Marko

--
Marko Mäkelä, Lead Developer InnoDB
MariaDB plc
