Greetings all.

This note announces the next release of GNU Awk: version 5.4.0.

The following files may be retrieved via HTTPS from
https://ftp.gnu.org/gnu/gawk:

-rw-rw-r-- 1 arnold arnold 6640108 Feb 22 16:46 gawk-5.4.0.tar.gz
-rw-rw-r-- 1 arnold arnold 3803276 Feb 22 16:46 gawk-5.4.0.tar.xz
-rw-rw-r-- 1 arnold arnold 3611641 Feb 22 16:46 gawk-5.4.0.tar.lz

This is a major release. The relevant part of the NEWS file is appended
below.

A diff file from the previous release is not available.

The online manuals have been updated.

This release requires particular acknowledgements and thanks to the
following people:

- Mike Haertel, for the new MinRX matcher. This release is the
  culmination of over two years of work. More detail on the new
  matcher is provided below.

- Eli Zaretskii, for his Herculean efforts to bring correct behavior
  for Unicode/UTF-8 to the MinGW port.

- John Malmberg for similar efforts in updating the OpenVMS port.
  Gawk can be built and runs on all three 64-bit OpenVMS
  architectures: Alpha, Itanium and Intel x86_64. Reviving the VAX
  port will hopefully occur in a future patch or major release.

- The rest of my "crack portability team" also deserves thanks:
  Corinna Vinschen for Cygwin, Nelson H.F. Beebe for broad
  portability testing, Daniel Richard G. for z/OS support, Andy
  Schorr, Jurgen Kahrs, Pat Rankin, Michal Jaegerman, Scott Deifik
  and Hermann Peifer for general portability testing.

The usual GNU build incantation should be used:

    tar -xpvzf gawk-5.4.0.tar.gz
    cd gawk-5.4.0
    ./configure && make && make check

Please use the gawkbug script to report bugs. If it doesn't work for
you, then send email to [email protected].

NOTE that the manual's instructions for sending bug reports were
updated again. Please review them carefully before submitting a report!

ONLY bug reports should be submitted to the bug-gawk list. All other
questions should use the [email protected] mailing list.

Enjoy!

Arnold Robbins (on behalf of all the gawk developers)
[email protected]
------------------------------------------------------------
  Copyright (C) 2019, 2020, 2021, 2022, 2023, 2024, 2025, 2026
  Free Software Foundation, Inc.

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
  notice and this notice are preserved.

Changes from 5.3.x to 5.4.0
---------------------------

1. This release now uses Mike Haertel's MinRX regular expression matcher
   as the default regexp engine. The old regex and dfa engines are still
   available. More detail is available in the manual, and in the file
   README_d/README.matchers. At the very least, read that file!

2. The manual, in the Bugs section, now makes it explicit that
   (a) Ad hominem attacks on the lists will not be tolerated, and
   (b) Discussion of proprietary software is strongly discouraged.
   Repeated offenses are grounds for being banned from the lists.

3. There is now a new directive, @nsinclude, which works like @include
   but does not reset the namespace for the included file to "awk".
   See the manual for details.

4. When using lshift() or rshift() and attempting to shift by as many
   or more bits than in a uintmax_t, gawk returns zero, instead of
   whatever the C compiler and hardware might have done.

5. Gawk's use of persistent memory has changed somewhat:

   A. Gawk now stores additional meta-information in the backing file.
      This means that if you have a backing file with important data
      in it, you should dump the data to a text file using the old
      version, create a new backing file, and then read your data back
      in with the new version, to a *brand new* backing file.

   B. Gawk generates a warning if the version of gawk saved in the
      backing file doesn't match that of the current running gawk.

   C. It's now possible to use persistent memory and dynamic
      extensions without problems. Gawk notices if an extension is
      being loaded from a different path than what was first used and
      produces a fatal error in this case.

6. The ordchr extension now supports multibyte / wide characters.

7. Per the 2024 POSIX standard, `length(array)' is no longer an
   extension, but a regular feature. Thus --posix no longer rejects it
   and --lint no longer warns about it.

8. The --traditional option has been rationalized to bring gawk into
   sync with BWK awk. It no longer affects the return code from
   system(), and it no longer prevents using a regexp for RS.
   Internally, the code was cleaned up some as well.

9. Assertions in the C code are now enabled. To disable them, manually
   edit the various Makefiles after running configure and before
   running make. You will need to add -DNDEBUG to the CFLAGS variable.

10. PMA should now work on OpenBSD 7, FreeBSD 12 - 16, NetBSD 10 and 11,
    and MidnightBSD 3 and 4.

11. Hexadecimal floating-point values may now be used in program source
    code, with strtonum(), and with the -n/--non-decimal-data option.
    See the manual for details.

12. A large number of small "replacement" files for standard functions
    have been removed. These functions are now so standard that we
    simply expect them to always be available. This simplifies the
    distribution and the code maintenance.

13. Support for UDP in gawk's networking support is now obsolete. It
    never worked very well. It will be removed in version 6.0. Gawk
    issues a warning when attempting to use it.

14. Reading regular disk input files should be somewhat faster now,
    since gawk no longer checks for timeouts on such files. On one very
    large file, gawk '{ print }' saw approximately a 9% speedup.

15. The MinGW port of gawk for MS-Windows now supports UTF-8 encoded
    non-ASCII text when the console window where gawk runs uses the
    Windows codepage 65001 for output, even if the system-wide locale
    specifies another codepage. Similarly, the Cygwin port now also
    fully supports UTF-8.

16. There is a new option to configure: --enable-O3. This causes gcc to
    use -O3 instead of -O2 when compiling gawk. This is not the default
    because experience in some projects has shown (sadly) that -O3 can
    cause bugs.

17. There is a new translation: Arabic. The .gmo files for the ca, da,
    fi, ja, ka, ms, and vi translations are no longer built or included
    in the distribution, as those translations have gone too long
    without being updated. The .po files remain in the distribution,
    should any volunteers wish to come forward to update them.

18. OpenVMS support has been updated. This release builds on Alpha,
    Itanium and x86_64.

19. As usual, a number of small bugs have been fixed; see the ChangeLog
    for the details.

Changes from 5.3.2 to 5.3.x
---------------------------

1. The Hebrew translation has been revived.

2. All non-standard variables are now not installed for --traditional
   and --posix.

3. It's been discovered that persistent memory and dynamic extensions
   don't mix. For now, trying this combination produces a fatal error.
   It may one day get fixed. Or, it may not.

4. A bug in the API has been fixed whereby using a numeric index to set
   an array element will work. As a result, the API minor version was
   increased to 1.

------------------ README.matchers --------------------------------
Wed Feb 18 11:39:29 AM IST 2026
===============================

* I * M * P * O * R * T * A * N * T *

This release includes a new regular expression matcher, MinRX, written
by Mike Haertel, the original author of GNU grep. It's available from
https://github.com/mikehaertel/minrx.

This matcher is fully POSIX compliant, which the current GNU matchers
are not. In particular it follows POSIX rules for finding the longest
leftmost submatches. It is also more strict as to regular expression
syntax, but primarily in a few corner cases that normal, correct,
regular expression usage should not encounter.

Because regular expression matching is such a fundamental part of
awk/gawk, the original GNU matchers are still included in gawk. In
order to use them, give a value to the GAWK_GNU_MATCHERS environment
variable before invoking gawk.

If you find a difference in behavior between the new and original
matchers, please report it, particularly if it adversely affects your
current application(s). Note that if the difference is due to being
fully POSIX compliant, then you should consider revising your
application.

Please use the gawkbug script to report any issues, as would be done
for any other bug. See node Bugs in the manual for more details; it's
online at https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.

PLEASE NOTE! The original GNU matchers will eventually be removed from
gawk. So, please take the time to notice and report any issues in the
MinRX matcher, so that they can be ironed out sooner rather than later.

Thanks!

Known Issues
============

When ignoring case, in locales where more than one lower-case character
maps to the same upper-case character, MinRX does not currently do the
right thing. This is being worked on and should be fixed in the first
patch.
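
One way to look for differences between the matchers, as requested
above, is to run the same program twice, once with GAWK_GNU_MATCHERS
unset and once with it set, and compare the output. What follows is
only a minimal sketch: compare_matchers is a hypothetical helper, not
something shipped with gawk, and the value 1 is arbitrary, since this
README only says that the variable must be given a value.

```shell
# Hypothetical helper (not part of gawk): run one awk program under both
# matchers and report whether the output differs.
# Usage: compare_matchers 'awk program' [input files...]
compare_matchers() {
    prog=$1
    shift

    # Default matcher (MinRX in 5.4.0): make sure the variable is unset
    # inside the command-substitution subshell.
    new=$(unset GAWK_GNU_MATCHERS; gawk "$prog" "$@")

    # Original GNU matchers: selected by giving the variable a value.
    # The value 1 is an assumption; any value should do per the README.
    old=$(GAWK_GNU_MATCHERS=1; export GAWK_GNU_MATCHERS; gawk "$prog" "$@")

    if [ "$new" = "$old" ]; then
        echo "same"
    else
        echo "different"
        printf 'default: %s\noriginal: %s\n' "$new" "$old"
    fi
}
```

For example, compare_matchers '{ if (match($0, /(a|ab)(c|bcd)/)) print RSTART, RLENGTH }' data.txt
would print "same" if both matchers agree on every line of a
(hypothetical) data.txt. Input is passed as file arguments rather than
on standard input so that both runs read the same data.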
