Thank you for describing the workaround; I'll try it on real files.

> Scintilla editing widget claims to handle illegal bytes and show them as 
> lozenge shapes with the hex in them

That is exactly what is needed here, and it sometimes already happens with 
Geany's encoding detection. I just think Geany could offer a manual way to 
specify "display what you can in this encoding, otherwise print an illegal 
symbol".
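
To illustrate the behaviour I mean, here is a minimal Python sketch (not 
Geany or Scintilla code; `decode_lenient` is a name made up for this example). 
It decodes with a user-chosen encoding and keeps every byte that does not fit 
visible as a hex escape instead of failing:

```python
def decode_lenient(data: bytes, encoding: str = "utf-8") -> str:
    """Decode what fits the chosen encoding; show the rest as \\xNN escapes."""
    # "backslashreplace" is a standard Python error handler: each byte that
    # cannot be decoded is replaced by its hex escape rather than raising.
    return data.decode(encoding, errors="backslashreplace")


# 0xE9 is 'é' in Latin-1 but is not valid as a lone byte in UTF-8.
raw = b"caf\xe9 ok"
print(decode_lenient(raw))  # prints: caf\xe9 ok
```

The legal part of the file stays readable and the illegal byte is marked, 
which is roughly what the lozenge display would give.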

> any code point 128 or greater is encoded as a sequence of _more than one_ 
> byte with a value >= 128

I did not know that more than one byte was required in every case above 127, 
thank you for outlining it. But a UTF-8 buffer should be able to store the 
values 128-255 with some hackery, which might not be needed at all given the 
Scintilla-level solution above.
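
A small Python sketch of both points, as I understand them. The first part 
checks the quoted rule for code points 128-255. The second part shows one 
existing kind of "hackery", Python's `surrogateescape` error handler, which 
maps each undecodable byte to a reserved code point so it can be restored 
later. This is only an illustration under that assumption, not something 
Geany does, and `utf8_bytes` is a helper invented for the example:

```python
def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single code point."""
    return chr(code_point).encode("utf-8")


# 127 still fits in one byte; 128 and above need two or more bytes,
# and every one of those bytes has a value >= 128.
for cp in (0x7F, 0x80, 0xE9, 0xFF):
    print(f"U+{cp:04X} -> {utf8_bytes(cp).hex(' ')}")

# One possible "hackery": smuggle a raw byte through a str and back.
raw = b"\xe9"
text = raw.decode("utf-8", errors="surrogateescape")  # lone surrogate U+DCE9
restored = text.encode("utf-8", errors="surrogateescape")
print(restored == raw)
```

The catch is that the intermediate string is not valid UTF-8 if encoded 
strictly, so the buffer would hold something that only looks like UTF-8.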

> encoding "detection" is "search for an encoding that will convert the file to 
> UTF-8"

Maybe that could be better emphasized in the program somehow. Also, _Without 
encoding_ is still a misleading label. Wouldn't _Auto-search_ or _First 
found_, while still generic, be a more appropriate name?
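
For what it's worth, this is how I picture the quoted description of 
"detection", as a hedged Python sketch. The candidate list and the function 
name are made up for the example and are not Geany's actual list or code:

```python
from typing import Optional, Sequence


def first_encoding_that_decodes(
    data: bytes,
    candidates: Sequence[str] = ("utf-8", "cp1252", "iso-8859-1"),
) -> Optional[str]:
    """Return the first candidate encoding that decodes data without error."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this encoding rejects the bytes, try the next one
        return name
    return None  # nothing in the list could convert the data


print(first_encoding_that_decodes("café".encode("utf-8")))  # utf-8
print(first_encoding_that_decodes(b"caf\xe9"))              # cp1252
```

The result only means "this one converted without error", not "this is the 
encoding the file was written in", which is why _First found_ seems a more 
honest name to me.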

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/geany/geany/issues/2910#issuecomment-933779249
