Replace pg_mblen() with bounds-checked versions.

A corrupted string could cause code that iterates with pg_mblen() to
overrun its buffer.  Fix by converting all callers to one of the
following:

1. Callers with a null-terminated string now use pg_mblen_cstr(), which
raises an "illegal byte sequence" error if it finds a terminator in the
middle of the sequence.

2. Callers with a length or end pointer now use either
pg_mblen_with_len() or pg_mblen_range(), for the same effect, depending
on which of the two seems more convenient at each site.

3. A small number of cases pre-validate a string, and can use
pg_mblen_unbounded().
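
As a rough illustration of the first two cases, here is a minimal,
self-contained sketch.  It is not the code from this commit: every
demo_* name is hypothetical, the length lookup assumes UTF-8 only, and
it returns -1 where the real functions raise an "illegal byte sequence"
error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the encoding's length lookup (UTF-8 only). */
static int
demo_utf8_claimed_len(unsigned char c)
{
	if (c < 0x80)
		return 1;
	if ((c & 0xE0) == 0xC0)
		return 2;
	if ((c & 0xF0) == 0xE0)
		return 3;
	if ((c & 0xF8) == 0xF0)
		return 4;
	return 1;					/* invalid lead byte: step over one byte */
}

/* End-pointer variant: -1 if the claimed length runs past 'end'. */
static int
demo_mblen_range(const char *s, const char *end)
{
	int			len;

	assert(s < end);			/* caller must supply at least one byte */
	len = demo_utf8_claimed_len((unsigned char) *s);
	if (len > end - s)
		return -1;				/* the real function would raise an error */
	return len;
}

/* Length variant: the same check, expressed as a byte count. */
static int
demo_mblen_with_len(const char *s, size_t limit)
{
	return demo_mblen_range(s, s + limit);
}

/* C-string variant: -1 if a terminator appears inside the character. */
static int
demo_mblen_cstr(const char *s)
{
	int			len = demo_utf8_claimed_len((unsigned char) *s);

	for (int i = 1; i < len; i++)
	{
		if (s[i] == '\0')
			return -1;			/* the real function would raise an error */
	}
	return len;
}

/* Example loop: count characters without reading past s + len. */
static int
demo_count_chars(const char *s, size_t len)
{
	const char *end = s + len;
	int			n = 0;

	while (s < end)
	{
		int			l = demo_mblen_range(s, end);

		if (l < 0)
			return -1;			/* truncated final character */
		s += l;
		n++;
	}
	return n;
}
```

In this sketch the loop never advances past the end of the buffer,
because a lead byte that claims more bytes than remain is rejected
before the pointer moves.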

The traditional pg_mblen() function and COPYCHAR macro still exist for
backward compatibility, but are no longer used by core code and are
hereby deprecated.  The same applies to the t_isXXX() functions.

Security: CVE-2026-2006
Backpatch-through: 14
Co-authored-by: Thomas Munro <[email protected]>
Co-authored-by: Noah Misch <[email protected]>
Reviewed-by: Heikki Linnakangas <[email protected]>
Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1e7fe06c10c0a8da9dd6261a6be8d405dc17c728

Modified Files
--------------
contrib/btree_gist/btree_utils_var.c     |  21 +++--
contrib/dict_xsyn/dict_xsyn.c            |   4 +-
contrib/hstore/hstore_io.c               |   2 +-
contrib/ltree/crc32.c                    |   3 +-
contrib/ltree/lquery_op.c                |   4 +-
contrib/ltree/ltree.h                    |   2 +-
contrib/ltree/ltree_io.c                 |   8 +-
contrib/ltree/ltxtquery_io.c             |   2 +-
contrib/pageinspect/heapfuncs.c          |   2 +-
contrib/pg_trgm/trgm.h                   |   2 +-
contrib/pg_trgm/trgm_op.c                |  52 +++++++----
contrib/pg_trgm/trgm_regexp.c            |  23 ++---
contrib/pgcrypto/crypt-sha.c             |   2 +-
contrib/unaccent/unaccent.c              |   5 +-
src/backend/catalog/pg_proc.c            |   2 +-
src/backend/tsearch/dict_synonym.c       |   4 +-
src/backend/tsearch/dict_thesaurus.c     |   8 +-
src/backend/tsearch/regis.c              |  37 ++++----
src/backend/tsearch/spell.c              |  81 ++++++++---------
src/backend/tsearch/ts_locale.c          |  56 +++++++-----
src/backend/tsearch/ts_utils.c           |   2 +-
src/backend/tsearch/wparser_def.c        |   3 +-
src/backend/utils/adt/encode.c           |   6 +-
src/backend/utils/adt/formatting.c       |  22 ++---
src/backend/utils/adt/jsonfuncs.c        |   2 +-
src/backend/utils/adt/jsonpath_gram.y    |   3 +-
src/backend/utils/adt/levenshtein.c      |  14 +--
src/backend/utils/adt/like.c             |  18 ++--
src/backend/utils/adt/like_match.c       |   3 +-
src/backend/utils/adt/oracle_compat.c    |  33 ++++---
src/backend/utils/adt/regexp.c           |   9 +-
src/backend/utils/adt/tsquery.c          |  13 ++-
src/backend/utils/adt/tsvector.c         |  11 +--
src/backend/utils/adt/tsvector_op.c      |  10 ++-
src/backend/utils/adt/tsvector_parser.c  |  19 ++--
src/backend/utils/adt/varbit.c           |   8 +-
src/backend/utils/adt/varlena.c          |  38 +++++---
src/backend/utils/adt/xml.c              |  11 ++-
src/backend/utils/mb/mbutils.c           | 150 +++++++++++++++++++++++++++++--
src/include/mb/pg_wchar.h                |   7 ++
src/include/tsearch/ts_locale.h          |  30 ++++++-
src/include/tsearch/ts_utils.h           |  14 ++-
src/test/modules/test_regex/test_regex.c |   3 +-
43 files changed, 485 insertions(+), 264 deletions(-)
