Package: grep
Tags: patch

The patch avoids adding the same character more than once to a bracket
expression in trivial_case_ignore.  This can produce fewer tokens in
multibyte locales.

For example, FULLWIDTH LATIN SMALL LETTER A (ef bd 81) is transformed
as shown below, because each multibyte character in a CSET is expanded
into an OR expression in the DFA.

Before the patch:

[AAa] (where each character is fullwidth)
EF BD CAT 81 CAT EF BD CAT 81 CAT OR EF BC CAT A1 CAT OR

After the patch:

[Aa] (where each character is fullwidth)
EF BD CAT 81 CAT EF BC CAT A1 CAT OR
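
As a rough illustration of the idea (this is not the code in patch.txt),
the following self-contained C sketch builds the case-folded set without
duplicates and prints the resulting postfix tokens.  All names here
(add_unique, build_case_set, utf8_encode, struct tokbuf, put_token,
emit_char_set) are made up for the example.  The lower- and upper-case
forms are passed in by the caller so the sketch does not depend on the
locale; grep itself would presumably get them from the wide-character
case conversion functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append C to SET[0..*N) unless it is already present.  */
static void
add_unique (unsigned long *set, size_t *n, unsigned long c)
{
  for (size_t i = 0; i < *n; i++)
    if (set[i] == c)
      return;
  set[(*n)++] = c;
}

/* Store WC and its case variants in SET (room for 3) without
   duplicates, and return the number of distinct characters.
   LOWER and UPPER are supplied by the caller (an assumption made
   to keep the sketch locale-independent).  */
static size_t
build_case_set (unsigned long wc, unsigned long lower,
                unsigned long upper, unsigned long *set)
{
  size_t n = 0;
  add_unique (set, &n, wc);
  add_unique (set, &n, lower);
  add_unique (set, &n, upper);
  return n;
}

/* Encode C (U+0000..U+FFFF only) as UTF-8; return the byte count.  */
static size_t
utf8_encode (unsigned long c, unsigned char *buf)
{
  assert (c <= 0xFFFF);
  if (c < 0x80)
    {
      buf[0] = (unsigned char) c;
      return 1;
    }
  if (c < 0x800)
    {
      buf[0] = (unsigned char) (0xC0 | (c >> 6));
      buf[1] = (unsigned char) (0x80 | (c & 0x3F));
      return 2;
    }
  buf[0] = (unsigned char) (0xE0 | (c >> 12));
  buf[1] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
  buf[2] = (unsigned char) (0x80 | (c & 0x3F));
  return 3;
}

/* Space-separated token text plus a token counter.  */
struct tokbuf
{
  char text[256];
  size_t len;
  size_t count;
};

/* Append TOK to TB.  A token that does not fit is counted but its
   text is dropped.  */
static void
put_token (struct tokbuf *tb, const char *tok)
{
  size_t toklen = strlen (tok);
  size_t sep = tb->count ? 1 : 0;
  if (tb->len + sep + toklen < sizeof tb->text)
    {
      if (sep)
        tb->text[tb->len++] = ' ';
      memcpy (tb->text + tb->len, tok, toklen + 1);
      tb->len += toklen;
    }
  tb->count++;
}

/* Emit the bytes of each character joined by CAT, and join the
   characters with OR, in the postfix order used in the dumps above.  */
static void
emit_char_set (struct tokbuf *tb, const unsigned long *chars, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      unsigned char buf[3];
      char hex[3];
      size_t len = utf8_encode (chars[i], buf);
      for (size_t j = 0; j < len; j++)
        {
          snprintf (hex, sizeof hex, "%02X", (unsigned int) buf[j]);
          put_token (tb, hex);
          if (j > 0)
            put_token (tb, "CAT");
        }
      if (i > 0)
        put_token (tb, "OR");
    }
}
```

Calling build_case_set (0xFF41, 0xFF41, 0xFF21, set) and passing the
result to emit_char_set on a zero-initialized struct tokbuf gives the
11-token sequence shown under "After the patch".  Feeding the three
characters to emit_char_set without deduplication gives the 17-token
sequence shown under "Before the patch".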

Attachment: patch.txt
Description: Binary data
