2015-05-15 17:45 GMT+02:00 Richard Wordingham < [email protected]>:
> I think this discussion on search and replace would benefit from some
> examples. I don’t see your problem. Is it based on experience? I have
> some fairly simple examples.

Just consider a regexp that attempts to search for "é" and substitute it (for example with "É"), and that has to locate it whether it is in NFC form (a single character) or in NFD form (a combining sequence). You'll also have to match cases where there are other intermediate combining characters (with a distinct non-zero combining class, different from the combining class of the acute accent) between the base letter and the acute accent. You then have to return discontiguous matches, but your replacement string "É" should still preserve the other combining characters.

The situation is even worse if you are looking for strings in which you want to discard only some combining characters (the replacement is empty): there may be several discontiguities in the matches.

Now imagine that the replacement string is to replace all these distinct combining characters with a single one. Such things would be done for filters that want to eliminate some combining characters not suitable for a given language, or because there's a linguistic orthographic rule that permits these substitutions of foreign combining characters, e.g. drop combining dots above, and replace all combining characters below, except the cedilla, with a single one such as a low line. Such things would also happen for languages that have changed or simplified their orthography with respect to combining characters, or that use two distinct orthographic conventions between which you want to convert.
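To make the two cases concrete, here is a minimal sketch in Python using only the standard unicodedata module. The function names (replace_e_acute, simplify_marks) are hypothetical and are not taken from any regexp engine. The sketch assumes it is acceptable to work on the NFD form and to return the whole text in NFC, which a real search and replace may not be allowed to do, and it treats "combining characters below" as those with combining class 220.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, combining class 230
DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE, combining class 230
LOW_LINE = "\u0332"   # COMBINING LOW LINE, combining class 220


def _has_unblocked(marks, mark):
    """True if `mark` occurs in `marks` with no earlier mark of the same class."""
    mark_ccc = unicodedata.combining(mark)
    for m in marks:
        if m == mark:
            return True
        if unicodedata.combining(m) == mark_ccc:
            return False  # an earlier mark of the same class blocks it
    return False


def replace_e_acute(text, base="e", mark=ACUTE, new_base="E"):
    """Replace "é" by "É" in NFC or NFD input, keeping all other marks."""
    nfd = unicodedata.normalize("NFD", text)
    out = []
    i = 0
    while i < len(nfd):
        # Collect the run of combining marks that follows nfd[i].
        j = i + 1
        while j < len(nfd) and unicodedata.combining(nfd[j]) != 0:
            j += 1
        marks = nfd[i + 1:j]
        # The match is discontiguous: the base letter plus one mark in the run.
        if nfd[i] == base and _has_unblocked(marks, mark):
            out.append(new_base)
        else:
            out.append(nfd[i])
        out.append(marks)  # every mark, including the intermediate ones, survives
        i = j
    return unicodedata.normalize("NFC", "".join(out))


def simplify_marks(text):
    """Drop dots above; collapse class-220 marks below into one low line."""
    out = []
    low_line_added = False
    for ch in unicodedata.normalize("NFD", text):
        ccc = unicodedata.combining(ch)
        if ccc == 0:
            low_line_added = False  # a new base starts a new combining sequence
            out.append(ch)
        elif ch == DOT_ABOVE:
            continue  # empty replacement
        elif ccc == 220:
            if not low_line_added:
                out.append(LOW_LINE)
                low_line_added = True
        else:
            out.append(ch)  # the cedilla has class 202, so it is kept
    return unicodedata.normalize("NFC", "".join(out))
```

For example, replace_e_acute("e\u0323\u0301") changes the base letter and keeps the dot below that sits between "e" and the acute accent, while "e\u0308\u0301" is left alone because the diaeresis has the same combining class as the acute and therefore blocks it.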

