On 2025-05-31, rsyk...@disroot.org <rsyk...@disroot.org> wrote:
> Dear list,
>
> I was surprised to learn that 'grep -i' does not
> really work for accented letters
>
> odin:~$ cat a
> křížala
> kŘíŽala
> odin:~$ grep -i ž a
> křížala
> odin:~$ grep -i Ž a
> kŘíŽala
>
> As I had LC_COLLATE="C", I tried also with this
> set to en_US.UTF-8, but to no avail.
>
> Does grep -i only work for ascii letters?

Yes, that's expected. OpenBSD base doesn't support LC_COLLATE.

$ man -k ANY=LC_COLLATE
locale(1) - character encoding and localization conventions
glob, globfree(3) - generate pathnames matching a pattern
setlocale(3) - select character encoding
strcoll, strcoll_l(3) - compare strings according to current collation
strxfrm, strxfrm_l(3) - transform a string under locale
wcscoll, wcscoll_l(3) - compare wide strings according to the current collation
wcsxfrm, wcsxfrm_l(3) - transform a wide string under locale

$ man locale
LOCALE(1)               General Commands Manual               LOCALE(1)

NAME
     locale – character encoding and localization conventions

SYNOPSIS
     locale [-a | -m | charmap]

[...]

     A locale is a set of environment variables telling programs which
     character encoding, language and cultural conventions the user
     prefers.  Programs in the OpenBSD base system ignore the locale
     except for the character encoding, and it is not recommended to use
     any of these variables except that the following non-default
     setting is supported as an option:

           export LC_CTYPE=en_US.UTF-8

     Programs installed from packages(7) may or may not change behavior
     according to the locale.  Many programs use the X/Open System
     Interfaces naming scheme for the contents of the variables listed
     below, which is

           language[_TERRITORY][.encoding][@modifier]

[...]

> Is there a general way to achieve 'true' case
> insensitive match (other than list all possibly
> present accented letters in both forms, i.e.,
> as [žŽ] in my case)?

ggrep does in this instance, but I don't know how reliable that is.

-- 
Please keep replies on the mailing list.
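
For what it's worth, here is a minimal sketch of the "list both forms" workaround using only base grep. It is an illustration, not a general solution: grep_both_cases is a hypothetical helper name, and it assumes you already know both case forms of the letter you care about. It passes each form as its own fixed-string pattern (-F with two -e options), so the match is a plain comparison of the encoded characters and should not depend on any locale setting.

```shell
# Hypothetical helper: print lines containing either case form of a string.
# $1 = lowercase form, $2 = uppercase form, $3 = file to search.
grep_both_cases() {
    # -F treats the patterns as fixed strings; each -e adds one alternative.
    grep -F -e "$1" -e "$2" "$3"
}

# Sample data taken from the original question.
sample=$(mktemp)
printf 'křížala\nkŘíŽala\n' > "$sample"

grep_both_cases ž Ž "$sample"
rm -f "$sample"
```

On the two-line sample file this should print both lines. Using two separate patterns rather than a bracket expression avoids relying on how grep interprets multibyte characters inside [...].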