I don't have much to add on the topic itself, just a short note: I found that uconv from icu-devtools has more options, including some transliteration options, in case you didn't know about it. I'm no expert, and I still haven't managed to achieve what I had to do.
On Tue, Jul 18, 2023, 12:13 AM Grisha Levit <grishale...@gmail.com> wrote:
> On Mon, Jul 17, 2023 at 3:29 PM Chet Ramey <chet.ra...@case.edu> wrote:
> >
> > On 7/7/23 5:05 PM, Grisha Levit wrote:
> > > A few small tweaks for the macOS-specific normalization handling to
> > > handle the issues below:
> >
> > The issue is that the behavior has to be different between cases where
> > the shell is reading input from the terminal and gets NFC characters
> > that need to be converted to NFD (which is how HFS+ and APFS store them)
> > and when the shell is reading input from a file and doesn't need to (and
> > should not) do anything with NFD characters.
>
> NB: while HFS+ stores NFD names, APFS preserves normalization, so we
> can get either NFC or NFD text back from readdir. Both are
> normalization-insensitive: "Being normalization-insensitive ensures
> that normalization variants of a filename cannot be created in the
> same directory, and that a filename can be found with any of its
> normalization variants." [1]
>
> Currently, Bash never actually converts to NFD. The fnx_tofs()
> function is there but it is never used. Instead, Bash converts
> filenames to NFC with fnx_fromfs() before comparing with either the
> glob pattern or the completion hint text (which is never converted).
>
> Since access is normalization-insensitive, we just need to normalize
> to _some_ form, so going to NFC is fine, but if we're going to do that
> we should normalize both the filesystem name and the text being
> compared.
>
> If there's a match, globs expand to the filenames (NFC or NFD) as
> returned by readdir(), and Readline completes with NFC-normalized
> versions of the names. I think this makes sense.
>
> What doesn't work quite right currently though is that glob patterns
> with NFD text never match anything, and completion prefixes with NFD
> text never expand to anything.
>
> [1]: https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/FAQ/FAQ.html
>
> > Does iconv work when taking NFD input that came from the file system and
> > trying to convert it to NFD (UTF-8-MAC)? I've honestly never checked.
>
> Converting to UTF-8-MAC always normalizes to NFD:
>
> $ printf '\303\251\0\145\314\201' | iconv -f UTF-8-MAC -t UTF-8-MAC | od -b -An
> 145 314 201 000 145 314 201
>
> $ printf '\303\251\0\145\314\201' | iconv -f UTF-8 -t UTF-8-MAC | od -b -An
> 145 314 201 000 145 314 201
>
> But Bash only converts from UTF-8-MAC to UTF-8, which always normalizes to
> NFC:
>
> $ printf '\303\251\0\145\314\201' | iconv -f UTF-8-MAC -t UTF-8 | od -b -An
> 303 251 000 303 251
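
To make the quoted point about normalizing both sides a bit more concrete, here is a rough Python sketch. It is only an illustration, not Bash's C code: it uses Python's unicodedata module instead of iconv with UTF-8-MAC, and the helper names (to_nfc, same_name, one_sided_match) are made up for this example. The two byte strings are the same "é" spellings used in the printf examples quoted above.

```python
import unicodedata

# The two spellings of "é" from the quoted iconv examples, as UTF-8 bytes.
NFC_E = b"\xc3\xa9"    # octal 303 251: U+00E9, precomposed (NFC)
NFD_E = b"e\xcc\x81"   # octal 145 314 201: "e" + U+0301 combining acute (NFD)


def to_nfc(raw: bytes) -> str:
    """Decode UTF-8 bytes and normalize the text to NFC."""
    return unicodedata.normalize("NFC", raw.decode("utf-8"))


def same_name(fs_name: bytes, typed: bytes) -> bool:
    """Hypothetical helper: compare after normalizing BOTH sides to NFC."""
    return to_nfc(fs_name) == to_nfc(typed)


def one_sided_match(fs_name: bytes, typed: bytes) -> bool:
    """Hypothetical helper: normalize only the filesystem name.

    This mimics the situation described in the quoted message, where the
    pattern or completion text is compared without being converted.
    """
    return to_nfc(fs_name) == typed.decode("utf-8")


if __name__ == "__main__":
    # Normalizing both sides: every combination compares equal.
    print(same_name(NFC_E, NFD_E), same_name(NFD_E, NFD_E))
    # Normalizing one side only: NFD text on the typed side never matches.
    print(one_sided_match(NFD_E, NFC_E), one_sided_match(NFD_E, NFD_E))
```

With only the filesystem name normalized, typed NFD text fails to match even a name that is byte-for-byte identical to it. As far as I can tell, that is the failure described above for glob patterns and completion prefixes with NFD text.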