On 11/22/23 7:46 PM, Grisha Levit wrote:
> Many of the tests in unicode.sub don't actually run because the arrays containing codepoints to test are sparse and the TestCodePage function assumes that they are not.
Thanks for the report. Nice attention to detail. This test has not changed substantially since it was donated in 2012. Namerefs didn't exist then, and my guess is that John Kearney wasn't familiar or comfortable with ${!x[@]}. https://lists.gnu.org/archive/html/bug-bash/2012-02/msg00063.html
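For anyone following along, here is a minimal sketch of the difference. It is not the code from unicode.sub; the array contents, the variable names, and the walk_set_indices helper are made up for illustration. The first loop assumes indices 0..N-1 are all set, the second walks only the indices that exist, and the helper does the same through a nameref (which needs bash-4.3 or later).

```shell
#!/usr/bin/env bash
# Hypothetical sparse array: only three indices are set.
codepoints=([0x41]=41 [0xF6]=f6 [0xFE]=fe)

# Loop that assumes the array is dense. ${#codepoints[@]} is 3, so it
# visits indices 0, 1 and 2, none of which are set.
dense_hits=0
for (( i = 0; i < ${#codepoints[@]}; i++ )); do
    if [[ -n ${codepoints[i]+set} ]]; then
        dense_hits=$(( dense_hits + 1 ))
    fi
done

# Loop over the indices that are actually set.
sparse_hits=0
for i in "${!codepoints[@]}"; do
    sparse_hits=$(( sparse_hits + 1 ))
done

# Hypothetical helper: takes the array *name* and walks its set indices.
walk_set_indices() {
    local -n arr=$1      # nameref, bash-4.3 or later
    local i
    for i in "${!arr[@]}"; do
        printf 'U+%04X -> %s\n' "$i" "${arr[i]}"
    done
}

echo "dense loop saw $dense_hits set elements, sparse loop saw $sparse_hits"
walk_set_indices codepoints
```

With the dense-style loop nothing in this array is ever tested, which is the same kind of silent skip described in the report.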
> If that's fixed, the zh_TW.BIG5 tests run but fail. I'm not sure what the original intent was, they seem to expect U+00F6..U+00FE to be encoded as 0xF6..0xFE which is not the case.
You'd have to ask John, I guess. These tests never got run in any case, since the original code, as you pointed out, assumed the array wasn't sparse, so the discrepancy was never discovered. My guess is that the point was to check how codepoints that don't encode valid characters in the target character set (though 0xf7 is valid) are displayed, but you can't be sure. In any case, they're just wrong.

This was part of a much wider discussion about Unicode character conversion in bash-4.2, which you might find interesting even though it is now twelve years old.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/