Re: Emacs puts binary junk into the clipboard, marking it as text

Jan D. Tue, 19 Sep 2006 12:33:41 -0700

Stefan Monnier wrote:
>>> Also IIRC a perfectly valid utf-8 buffer may contain eight-bit-* chars, used
>>> to keep track of valid unicode chars that have no corresponding character in
>>> emacs-mule.  So the presence of eight-bit-* chars does not imply that the
>>> utf-8 encoded form of the text will contain an invalid utf-8 byte sequence.
>
>> Yes, but such eight-bit-* chars can be detected by checking
>> `untranslated-utf-8' property.
>
> Sure, but the current code doesn't do that.
>
>>>> And, if Emacs owns a unibyte string, perhaps the right thing
>>>> is to make it multibyte according to the current
>>>> lang. env. (by string-make-multibyte) at first, then encode
>>>> it by utf-8.
>
>>> That sounds terribly fragile/buggy.
>
>> Then, what do you think Emacs should do in such a case?
>
> I think we can't know what should be done, so we should strive for
> simplicity and try to avoid losing information.  I.e. just return the
> unibyte string as-is.
>


That was the problem the original report was about.  Gtk+ applications
print big warnings.  And there is no agreed-upon selection type that
represents just bytes.

W.r.t. the standards, Emacs has two choices: return a valid UTF-8 string
or don't return anything at all.  I'm beginning to think the second
option is the best.
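
To make the two choices concrete, here is a minimal sketch in Python,
not Emacs code.  The function name selection_text is hypothetical, and
the sketch assumes the selection owner holds the data as raw bytes.  It
hands the text over only if the bytes are valid UTF-8, and otherwise
returns nothing.

```python
from typing import Optional


def selection_text(raw: bytes) -> Optional[str]:
    """Return raw decoded as UTF-8, or None if it is not valid UTF-8.

    Hypothetical helper for illustration only; this is not how Emacs
    implements selection conversion.
    """
    try:
        # Strict decoding raises on any invalid UTF-8 byte sequence.
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Second choice: refuse the request rather than hand out junk.
        return None
```

A requestor that gets None back would see the conversion as refused,
instead of receiving binary data labelled as text.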

    Jan D.



_______________________________________________
emacs-pretest-bug mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
