Vijay Marupudi wrote on Fri 21-01-2022 at 15:20 [-0500]:
+ (pass-if-exception "utf8->string range: end < start"
+     exception:out-of-range
+   (let* ((utf8 (string->utf8 "gnu guile")))
+     (utf8->string utf8 1 0)))
+ [other tests]
It would be nice to check multibyte characters as well, to verify that byte indices and not character indices are used. E.g., (utf8->string #vu8(195 169) 0 2) should return "é".

Another nice test: (utf8->string #vu8(195 169) 0 1) should raise a 'decoding-error', even though #vu8(195 169) is valid UTF-8, because the range cuts the two-byte sequence in half.

And (utf8->string #vu8(0 32 196) 0 2) should return "\x00 " even though #vu8(0 32 196) is invalid UTF-8. As a bonus, it checks that the nul character is supported, which is easily forgotten because Guile is implemented in C, which usually terminates strings with a zero byte instead of using a length field.

Overall, the patch you sent seems a reasonable approach to me, though I didn't verify the details. I find myself at times copying a part of a bytevector to a new bytevector because some procedure doesn't allow specifying byte ranges ...

Greetings,
Maxime
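
For reference, the three checks suggested above can be tried on an unpatched Guile with a small stand-in. This is only a sketch: `utf8-range->string` and `decoding-error?` are hypothetical helpers, not part of Guile or of the patch. The first one emulates the proposed byte range by copying the bytes out first, which is the workaround mentioned above. It assumes the stock `utf8->string` throws `decoding-error` on malformed input.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper (not part of Guile): decode the bytes in
;; [START, END) of BV by copying them into a fresh bytevector first.
;; START and END are byte indices, not character indices.
(define (utf8-range->string bv start end)
  (let* ((len (- end start))
         (part (make-bytevector len)))
    (bytevector-copy! bv start part 0 len)
    (utf8->string part)))

;; Hypothetical helper: #t if calling THUNK throws `decoding-error'.
(define (decoding-error? thunk)
  (catch 'decoding-error
    (lambda () (thunk) #f)
    (lambda args #t)))

;; Both bytes of the two-byte sequence are included: decodes to U+00E9.
(define whole (utf8-range->string #vu8(195 169) 0 2))

;; The range stops in the middle of the sequence: should not decode.
(define cut-in-half?
  (decoding-error? (lambda () (utf8-range->string #vu8(195 169) 0 1))))

;; The dangling lead byte 196 lies outside the range; the nul byte
;; inside the range must survive.
(define with-nul (utf8-range->string #vu8(0 32 196) 0 2))
```

With the patch applied, the same expectations would apply to (utf8->string bv start end) directly, without the intermediate copy.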