On Wed, Dec 18, 2013 at 8:53 AM, Santiago <[email protected]> wrote:
...
> $ src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
> src/grep: invalid UTF-8 byte sequence in input
>
> When I'd expected something like:
>
> $ LC_ALL=C src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc: if ($prototype =~
> m/DEFINE_SINGLE_EVENT\((.*?),/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc: if ($prototype =~
> m/DEFINE_EVENT\((.*?),(.*?),/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc:## if ($prototype =~
> m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc: if ($prototype =~
> m/SYSCALL_DEFINE0/) {
> ...
>
> Maybe, it is a pcre (v. 8.31) issue.

Hi Santiago,

Thanks for testing that.
What do you get when you run the stand-alone example I gave in the
commit log and in the test?

  printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?

For me (using pcre-8.33), it works the way I want and both lines match:

  jM-^B$
  j$
  0

Hmm... I see that with Debian unstable's 8.31-2, it does indeed act
differently.  I may have to think about excluding pcre support when the
version doesn't work the way I want.
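
One way such a check could look is a small probe built on the stand-alone
example above. This is only a rough sketch: the function name
pcre_handles_invalid_utf8 is hypothetical and nothing like it exists in grep,
and it assumes the en_US.UTF-8 locale is installed.

```shell
# Hypothetical helper (not part of grep): succeed only if the given grep
# command prints both lines of the stand-alone test input under -P.
pcre_handles_invalid_utf8() {
  grep_cmd=${1:-grep}
  # '\202' is the portable octal spelling of the \x82 byte used above.
  out=$(printf 'j\202\nj\n' |
        LC_ALL=en_US.UTF-8 "$grep_cmd" -P j 2>/dev/null) || return 1
  # Both input lines contain "j", so a usable pcre yields two output lines.
  test "$(printf '%s\n' "$out" | wc -l)" -eq 2
}
```

It could be run as "pcre_handles_invalid_utf8 src/grep && echo usable", with
a failure taken as a hint to skip -P for that pcre version.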