Hi David,

>>>>> David Kastrup <d...@gnu.org> writes:
>> I changed it to "[\x00-\xFF]+" to process all the raw 8-bit bytes
>> together at decoding with the relevant coding system.

> That does not cover raw bytes since they are not in the range 00-ff in
> Emacs multibyte characters.  So that expression would only work for
> bytes in buffers decoded from files considered to be in Latin-1
> encoding.

If my memory serves, that is the behavior of non-Unicode Emacs
(mule-version < 6).  Current Emacs (mule-version = 6) has a multibyte
treatment smart (or confusing) enough to match a raw 8-bit byte against
the regexp "[\x00-\xFF]".  Both of the forms
(string-match "[\x00-\xFF]" (string-to-unibyte (byte-to-string #xab)))
(string-match "[\x00-\xFF]" (string-to-multibyte (byte-to-string #xab)))
return a non-nil value (0), at least on my Emacs 26.1.

Although it is true that raw 8-bit characters in a multibyte string are
not in the range 00-ff, current Emacs automatically (and implicitly)
converts them into 00-ff when matching against such regexps.  While
the form
(aref (string-to-multibyte (byte-to-string #xab)) 0)
returns #x3fffab, the string still matches "[\x00-\xFF]" in
`string-match'.  (I admit that this behavior is confusing.)
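
To make the contrast easier to reproduce, here is a minimal sketch that
gathers the checks above into one form.  The helper name
`my-raw-byte-report' is hypothetical (it is not part of Emacs or
AUCTeX), and I have only observed the match results on Emacs 26.1, so
other versions may differ.

```elisp
;; Sketch only: `my-raw-byte-report' is a hypothetical helper name.
(defun my-raw-byte-report (byte)
  "Return a plist showing how raw BYTE looks and matches in both string types."
  (let* ((uni (byte-to-string byte))          ; unibyte string holding BYTE
         (multi (string-to-multibyte uni)))   ; same byte as a raw 8-bit char
    (list :unibyte-code (aref uni 0)          ; the byte value itself
          :multibyte-code (aref multi 0)      ; #x3fff00 + BYTE for BYTE >= #x80
          :multibyte-p (multibyte-string-p multi)
          ;; The two matches discussed above; both gave 0 on Emacs 26.1.
          :match-unibyte (string-match "[\x00-\xFF]" uni)
          :match-multibyte (string-match "[\x00-\xFF]" multi))))

(my-raw-byte-report #xab)
```

For #xab, :unibyte-code is #xab and :multibyte-code is #x3fffab, so the
character codes differ even though both strings match the regexp.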

Regards,
Ikumi Keita

_______________________________________________
auctex-devel mailing list
auctex-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/auctex-devel
