Re: decode-coding-string gone awry?

Kenichi Handa Thu, 17 Feb 2005 04:47:16 -0800

In article <[EMAIL PROTECTED]>, Stefan Monnier <[EMAIL PROTECTED]> writes:


>>  Is it reasonable to operate with decode-coding-string on a multibyte
>>  string?  If that is nonsense, maybe we should make it get an error,
>>  to help people debug such problems.

> I think it would indeed make sense to signal errors when decoding
> a multibyte string or when encoding a unibyte string.

>>  If there are some few cases where decode-coding-string makes sense on
>>  a multibyte string, maybe we can make it get an error except in those
>>  few cases.

> The problem I suspect is that it's pretty common for ASCII-only strings to
> be arbitrarily marked unibyte or multibyte depending on the circumstance.
> So we would have to check for the case where the string is ASCII-only before
> signalling an error.

> I'm actually running right now with an Emacs that does signal such errors.
> I've changed the notion of "multibyte/unibyte" string by saying:
> - [same as now] if size_byte < 0, it's UNIBYTE.
> - [same as now] if size_byte > size, it's MULTIBYTE.
> - [changed]     if size_byte == size, it's neither/both (ASCII-only).

> Then I've changed several parts of the C code to try and set size_byte==size
> whenever possible (instead of marking the string as unibyte).
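
The three-way rule quoted above can be sketched in C as follows. This is only an illustration of the proposal as described, not code from the Emacs sources: the enum, its constants, and the function name classify_string are hypothetical, and only size and size_byte come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three states in the proposal. */
enum string_kind
{
  KIND_UNIBYTE,    /* size_byte < 0 */
  KIND_MULTIBYTE,  /* size_byte > size */
  KIND_ASCII_ONLY  /* size_byte == size: neither/both */
};

/* Classify a string from its character count (size) and its
   byte count (size_byte), following the proposed rule.  */
enum string_kind
classify_string (ptrdiff_t size, ptrdiff_t size_byte)
{
  assert (size >= 0);
  if (size_byte < 0)
    return KIND_UNIBYTE;
  if (size_byte > size)
    return KIND_MULTIBYTE;
  /* A non-negative byte count cannot be smaller than the
     character count, so only equality remains here.  */
  assert (size_byte == size);
  return KIND_ASCII_ONLY;
}
```

Under this sketch, an error on decoding would be signalled only for KIND_MULTIBYTE, and an error on encoding only for KIND_UNIBYTE.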

Even if size_byte == size, the string may contain
eight-bit-graphic characters, and decoding such a string is
a valid operation.  And even if size_byte > size, it may
contain only ASCII, eight-bit-graphic, and eight-bit-control
characters.  Decoding that string is also a valid operation.

It is not trivial to change the current code (in coding.c)
to signal an error safely in the middle of a code
conversion.  So, to check whether decoding is valid, we
would have to scan every character of the string in advance,
which, I think, would slow down the operation considerably.
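
To make the cost concrete, here is a minimal sketch of the simplest such pre-scan, the ASCII-only test mentioned in the quoted text. The function name bytes_ascii_only is hypothetical and this is not code from coding.c. A real validity check would also have to accept the eight-bit characters described above, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if every byte in p[0..n-1] is ASCII (below 0x80),
   otherwise 0.  An empty range counts as ASCII-only.  */
int
bytes_ascii_only (const unsigned char *p, size_t n)
{
  assert (p != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (p[i] >= 0x80)
      return 0;  /* Found a non-ASCII byte; stop early.  */
  return 1;
}
```

For an ASCII-only string the loop reads all n bytes, so the check adds a full extra pass over the text before the conversion itself begins.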

---
Ken'ichi HANDA
[EMAIL PROTECTED]


_______________________________________________
Emacs-devel mailing list
Emacs-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-devel
