It's good that the masking with 0x1fffff now occurs only if PCRE_NO_UTF32_CHECK
is specified. Unicode conformance can be improved, and the code made slightly
smaller, faster, and more flexible, with a simple change to pcre_internal.h: by
default, PCRE_NO_UTF32_CHECK should disable checking without enabling masking,
and masking should be enabled only by a separate compile-time option (named
PCRE_MASK_UTF32_BEYOND_1FFFFF below). The definition of UTF32_MASK can be
replaced by the following:

#if defined PCRE_MASK_UTF32_BEYOND_1FFFFF
#define ADJUST_UTF32_CODE_UNIT(c) ((c) & 0x1fffffu)
#else
#define ADJUST_UTF32_CODE_UNIT(c) (c)
#endif

and these macros can be revised as follows:

#define GETCHAR(c, eptr) \
 c = ADJUST_UTF32_CODE_UNIT(*(eptr));

#define GETCHARTEST(c, eptr) \
 c = *(eptr); \
 if (utf) c = ADJUST_UTF32_CODE_UNIT(c);

#define GETCHARINC(c, eptr) \
 c = ADJUST_UTF32_CODE_UNIT(*(eptr)++);

#define GETCHARINCTEST(c, eptr) \
 c = *(eptr)++; \
 if (utf) c = ADJUST_UTF32_CODE_UNIT(c);

#define RAWUCHAR(eptr) \
 ADJUST_UTF32_CODE_UNIT(*(eptr))

#define RAWUCHARINC(eptr) \
 ADJUST_UTF32_CODE_UNIT(*(eptr)++)

#define RAWUCHARTEST(eptr) \
 (utf ? (ADJUST_UTF32_CODE_UNIT(*(eptr))) : *(eptr))

#define RAWUCHARINCTEST(eptr) \
 (utf ? (ADJUST_UTF32_CODE_UNIT(*(eptr)++)) : *(eptr)++)
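To illustrate the difference between the two configurations, here is a minimal
standalone sketch. It is not PCRE code: ADJUST_MASKED, ADJUST_UNMASKED, the two
GETCHARINC_* variants, and the helper functions are hypothetical names chosen so
that both behaviours can sit side by side in one file, whereas the proposal
selects a single ADJUST_UTF32_CODE_UNIT at compile time. It also assumes code
units are held in uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two build configurations. */
#define ADJUST_MASKED(c)   ((c) & 0x1fffffu) /* as if PCRE_MASK_UTF32_BEYOND_1FFFFF were defined */
#define ADJUST_UNMASKED(c) (c)               /* proposed default: code unit passes through */

/* Same shape as the proposed GETCHARINC: read one code unit, then advance. */
#define GETCHARINC_MASKED(c, eptr)   c = ADJUST_MASKED(*(eptr)++);
#define GETCHARINC_UNMASKED(c, eptr) c = ADJUST_UNMASKED(*(eptr)++);

/* Read one code unit through the masking variant and advance *pp. */
static uint32_t read_masked(const uint32_t **pp)
{
  uint32_t c;
  GETCHARINC_MASKED(c, *pp)
  return c;
}

/* Read one code unit through the pass-through variant and advance *pp. */
static uint32_t read_unmasked(const uint32_t **pp)
{
  uint32_t c;
  GETCHARINC_UNMASKED(c, *pp)
  return c;
}

/* Walk the same subject with both variants and compare the results. */
static void check_adjust_variants(void)
{
  static const uint32_t subject[] = { 0x41u, 0x10ffffu, 0x80000041u };
  const uint32_t *m = subject;
  const uint32_t *u = subject;

  /* Values within 21 bits are identical under both variants. */
  assert(read_masked(&m) == 0x41u     && read_unmasked(&u) == 0x41u);
  assert(read_masked(&m) == 0x10ffffu && read_unmasked(&u) == 0x10ffffu);

  /* High bits are silently dropped only when masking is enabled. */
  assert(read_masked(&m)   == 0x41u);
  assert(read_unmasked(&u) == 0x80000041u);

  /* Both variants advance the pointer by exactly one code unit per read. */
  assert(m == subject + 3 && u == subject + 3);
}
```

Calling check_adjust_variants() from a test driver exercises both variants. The
last value is the interesting case: with masking, 0x80000041 is read as U+0041,
while without it the out-of-range value is left intact for the caller to see.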

Best wishes,

Tom

文林 Wenlin Institute, Inc.        Software for Learning Chinese
E-mail: [email protected]     Web: http://www.wenlin.com
Telephone: 1-877-4-WENLIN (1-877-493-6546)
☯




-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 
