In article <[EMAIL PROTECTED]>, Stefan Monnier <[EMAIL PROTECTED]> writes:
>> Is it reasonable to operate with decode-coding-string on a multibyte
>> string?  If that is nonsense, maybe we should make it get an error,
>> to help people debug such problems.

> I think it would indeed make sense to signal errors when decoding
> a multibyte string or when encoding a unibyte string.

>> If there are some few cases where decode-coding-string makes sense on
>> a multibyte string, maybe we can make it get an error except in those
>> few cases.

> The problem I suspect is that it's pretty common for ASCII-only strings to
> be arbitrarily marked unibyte or multibyte depending on the circumstance.
> So we would have to check for the case where the string is ASCII-only before
> signalling an error.

> I'm actually running right now with an Emacs that does signal such errors.
> I've changed the notion of "multibyte/unibyte" string by saying:
> - [same as now] if size_byte < 0, it's UNIBYTE.
> - [same as now] if size_byte > size, it's MULTIBYTE.
> - [changed] if size_byte == size, it's neither/both (ASCII-only).
> Then I've changed several parts of the C code to try and set size_byte==size
> whenever possible (instead of marking the string as unibyte).

Even if size_byte == size, the string may contain eight-bit-graphic
characters, and decoding such a string is a valid operation.  And even
if size_byte > size, it may contain only ASCII, eight-bit-graphic, and
eight-bit-control characters.  Decoding such a string is also a valid
operation.

It is not trivial to change the current code (in coding.c) so that it
signals an error safely in the middle of a code conversion.  So, to
check whether decoding is valid or not, we would have to check all
characters in the string in advance, which, I think, would slow down
the operation considerably.

---
Ken'ichi HANDA
[EMAIL PROTECTED]

_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
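
For illustration only, the size-based classification quoted above and the
kind of advance scan the reply refers to might be sketched in C roughly as
follows.  This is not the code in coding.c or lisp.h: the struct and every
name starting with "sketch_" are hypothetical, and the scan only tests for
bytes at or above 0x80, which is a simplification of a real validity check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the size fields of an Emacs string.
   This is NOT the real struct Lisp_String. */
struct sketch_string {
    ptrdiff_t size;             /* length in characters */
    ptrdiff_t size_byte;        /* length in bytes; negative means unibyte */
    const unsigned char *data;  /* raw bytes */
};

enum sketch_kind { SKETCH_UNIBYTE, SKETCH_MULTIBYTE, SKETCH_EITHER };

/* Three-way classification from the quoted proposal.  It looks only at
   the two size fields, so it costs O(1). */
static enum sketch_kind sketch_classify(const struct sketch_string *s)
{
    if (s->size_byte < 0)
        return SKETCH_UNIBYTE;
    if (s->size_byte > s->size)
        return SKETCH_MULTIBYTE;
    return SKETCH_EITHER;       /* size_byte == size */
}

/* Advance scan over every byte: returns 1 if all bytes are below 0x80,
   0 otherwise.  This costs O(n) and would run before any decoding. */
static int sketch_ascii_only(const struct sketch_string *s)
{
    ptrdiff_t nbytes = s->size_byte < 0 ? s->size : s->size_byte;
    for (ptrdiff_t i = 0; i < nbytes; i++)
        if (s->data[i] >= 0x80)
            return 0;
    return 1;
}
```

The sketch shows the gap between the two checks: sketch_classify can report
SKETCH_EITHER for a string whose bytes are not all ASCII, so the size fields
alone cannot decide whether an error should be signalled, and only the
linear scan can.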