On Sun, 09 Feb 2014 12:18:41 +0400, Dmitry Olshansky <[email protected]> wrote:
> 09-Feb-2014 09:35, Marco Leise wrote:
> > That's neither an improvement over calling "validate" nor does
> > that deal with distinguishing between invalid UTF and
>
> Means text is broken but wasn't ever read...
>
> > \uFFFD
> > in the input.
>
> ...means text was broken sometime before.
>
> Hardly makes any difference to most applications.
> Normal text doesn't contain \uFFFD.

Of course it does. It is a valid symbol, and a lot of websites
describing the "Specials" Unicode block make use of it, like
the one on Wikipedia:
http://en.wikipedia.org/wiki/Specials_(Unicode_block)
With your definition, pulling such a document from the web and
parsing it in D would mean working on broken strings.

> >> [...]
> >> Every single text editor out there seems to disagree with you: they do
> >> show you partially substituted text, not a dialog box "My bad, it's
> >> broken UTF-8, I'm giving up!".
> > gedit does in fact throw an error message at you
> > saying "My bad, it's broken UTF-8, I'm giving up!".
> I know and it's a piece of junk :)
> Seriously, it doesn't even have regular expressions for search and replace!

https://yourlogicalfallacyis.com/no-true-scotsman :p

-- 
Marco
