metsw24-max opened a new pull request, #687:
URL: https://github.com/apache/logging-log4cxx/pull/687
Fix undefined behavior and locale-dependent output in
`StringHelper::toLowerCase`.
The previous implementation passed `LogString` characters directly to
`::tolower(int)` via `std::transform`. This violates the C standard requirement
that the argument must either be `EOF` or representable as `unsigned char`.
When `logchar` is `char` (commonly signed), any byte greater than `0x7F`
is promoted to a negative `int`, triggering undefined behavior. The behavior
also depended on the active `LC_CTYPE` locale, causing the same configuration
file to produce different lowercased values on different systems.
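
To make the promotion problem concrete, here is a minimal sketch. The helper names are hypothetical and do not appear in log4cxx. It contrasts the value `::tolower` receives when a `char` is passed directly with the value it receives after the conventional `unsigned char` cast. The cast removes the undefined behavior but not the locale dependence, which is presumably why the patch uses ASCII-only folding instead.

```cpp
#include <cctype>

// Hypothetical helper names, for illustration only; not part of log4cxx.

// What ::tolower receives when a char is passed directly: the char is
// promoted to int, so on signed-char platforms bytes above 0x7F go negative.
inline int promotedDirectly(char c)
{
    return c;
}

// What ::tolower receives after the conventional cast: always 0..UCHAR_MAX.
inline int promotedViaUnsignedChar(char c)
{
    return static_cast<unsigned char>(c);
}

// Well-defined for every byte, but still consults the active LC_CTYPE locale.
inline char lowerViaUnsignedChar(char c)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}
```

On a platform where `char` is signed, `promotedDirectly('\xC9')` is negative, whereas `promotedViaUnsignedChar('\xC9')` is `0xC9` everywhere.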
This patch replaces the locale-sensitive transformation with deterministic
ASCII-only folding (`A-Z -> a-z`) while preserving all non-ASCII bytes
unchanged.
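
The ASCII-only folding described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in the patch: `asciiToLower` is a hypothetical name, `std::string` stands in for `LogString`, and an ASCII-compatible execution character set is assumed.

```cpp
#include <string>

// Hypothetical stand-in for the patched StringHelper::toLowerCase.
// std::string is used in place of LogString to keep the sketch self-contained.
inline std::string asciiToLower(const std::string& s)
{
    std::string out(s);
    for (char& ch : out)
    {
        // Only 'A'..'Z' are folded; every other byte, including anything
        // above 0x7F, is copied through untouched. No locale is consulted.
        if (ch >= 'A' && ch <= 'Z')
        {
            ch = static_cast<char>(ch + ('a' - 'A'));
        }
    }
    return out;
}
```

For example, `asciiToLower("Hello World")` yields `"hello world"`, while a byte such as `0xC9` comes back unchanged regardless of the active locale.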
## Changes
### `src/main/cpp/stringhelper.cpp`
* Replaced `std::transform(..., tolower)` with a deterministic ASCII-only lowercase conversion
* Eliminates UB from invalid `tolower` inputs
* Removes locale-dependent behavior
* Preserves non-ASCII bytes unchanged
### `src/test/cpp/helpers/stringhelpertestcase.cpp`
Added regression coverage:
* `testToLowerCaseAscii`
  * verifies normal ASCII lowercase conversion
* `testToLowerCaseNonAsciiPassesThrough`
  * verifies non-ASCII bytes remain unchanged
  * validates locale-independent behavior
## Reproducer
With the original implementation:
* `testToLowerCaseNonAsciiPassesThrough` fails on systems using locales such
  as `English_India.1252`
  * Example: `0xC9` (`É`) may be transformed into `0xE9` (`é`) through
    locale-sensitive `tolower`
With this patch:
* all `stringhelpertestcase` tests pass
* `patternparsertestcase` also passes unchanged