I would like to redesign the InnoDB redo log format for better performance in MariaDB 10.4. Part of this would involve minimizing write amplification and optimizing for journaled file systems.
Jun Su from Microsoft suggested to me that regular in-place writes (as opposed to appends) could cause some write amplification inside journaled file systems.

InnoDB traditionally pre-allocates both data and log files. Maybe that was a good idea in 1994, when the code was initially conceived. But we have had journaled or copy-on-write file systems, and also SSDs, for quite some time now.

I wrote two test programs that write a 2GiB file in 2KiB blocks, either pre-allocating the file upfront or appending to the file. On the two SSDs that I tested (with ext4), appending was always faster. The programs are attached to https://jira.mariadb.org/browse/MDEV-14425

I would appreciate it if someone could provide a counterexample where writing to a preallocated file is faster than appending, on a modern file system. Also, I would like to see how HDDs perform.

With best regards,

Marko
-- 
Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation
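
P.S. For anyone who wants to try the comparison without opening the attachments, here is a minimal sketch of the kind of test I mean. It is not a copy of the attached programs. The names `write_blocks` and `compare` are made up for illustration, the default size is 64MiB rather than 2GiB so that it finishes quickly, and calling fsync only once at the end is an assumption, not a statement about how the redo log is flushed. It needs a POSIX system where Python provides `os.posix_fallocate` and `os.pwrite` (for example Linux).

```python
import os
import tempfile
import time

BLOCK_SIZE = 2 * 1024  # 2 KiB blocks, as in the test described above
# Assumption: a small default so the sketch runs quickly; the real test used 2 GiB.
DEFAULT_TOTAL = 64 * 1024 * 1024


def write_blocks(path, total_size, preallocate):
    """Write total_size bytes in BLOCK_SIZE blocks; return the elapsed seconds."""
    if total_size % BLOCK_SIZE != 0:
        raise ValueError("total_size must be a multiple of BLOCK_SIZE")
    block = b"\xa5" * BLOCK_SIZE
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        if preallocate:
            # Reserve all blocks up front, so that the timed writes below
            # overwrite existing blocks instead of extending the file.
            os.posix_fallocate(fd, 0, total_size)
            os.fsync(fd)
        start = time.perf_counter()
        for offset in range(0, total_size, BLOCK_SIZE):
            # Without preallocation, each write lands at the current end
            # of the file, that is, it is an append.
            os.pwrite(fd, block, offset)
        os.fsync(fd)  # assumption: one flush at the end, not one per block
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare(directory, total_size=DEFAULT_TOTAL):
    """Time both variants in the given directory and return {name: seconds}."""
    results = {}
    for name, preallocate in (("preallocated", True), ("append", False)):
        path = os.path.join(directory, name + ".tmp")
        try:
            results[name] = write_blocks(path, total_size, preallocate)
        finally:
            if os.path.exists(path):
                os.remove(path)
    return results


if __name__ == "__main__":
    # Create the scratch files under the current directory, so that the
    # file system being measured is the one you run the script on.
    with tempfile.TemporaryDirectory(dir=".") as scratch:
        for name, seconds in compare(scratch).items():
            print(f"{name}: {seconds:.3f} s")
```

Run it from a directory on the file system you want to measure. The numbers from a single run at this size will be noisy, so please treat them only as a starting point and use the attached programs for anything you intend to report.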