Fix off-by-one with NFC recomposition for Hangul U+11A7 (TBASE)

The NFC recomposition treated TBASE as a valid T syllable, which is
incorrect per the Unicode specification: TBASE (U+11A7) is one below
the start of the T range, which begins at U+11A8.  As a result, TBASE
was silently swallowed during normalization, leading to an incorrect
result.

A couple of regression tests are added to check more patterns with
Hangul recomposition and decomposition, on top of a test that checks
the problem with TBASE.

Diego has submitted the code fix, and I have written the tests.

Author: Diego Frias <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 14

Branch
------
REL_18_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/273fe94852b3a7e34fd171e8abdf1481beb302fa

Modified Files
--------------
src/common/unicode_norm.c             |  2 +-
src/test/regress/expected/unicode.out | 78 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/unicode.sql      | 20 +++++++++
3 files changed, 99 insertions(+), 1 deletion(-)
