Hi.

I was grepping mail logs for some entries, and a spammer had used a 0xF3 byte to try to hide from or trick log tools. With grep it looks like this:

$ printf 'a\xF3bcdefgh' > x2

$ LC_ALL=C.UTF-8 grep 'a.*h' x2
$

$ LC_ALL=C grep 'a.*h' x2
abcdefgh

$ LC_ALL=C.UTF-8 grep -a 'a.*h' x2
$

$ LC_ALL=C grep -a 'a.*h' x2
abcdefgh


Is this expected behavior: no binary-file warning and no match in a UTF-8 locale, even with -a? AFAIK that's not a valid UTF-8 sequence.
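
The invalid-sequence claim can be checked independently of grep. In UTF-8, 0xF3 is the lead byte of a four-byte sequence and must be followed by three continuation bytes in the range 0x80..0xBF; in x2 it is followed by 'b' (0x62). A minimal sketch, assuming an iconv(1) that exits non-zero on malformed input (glibc's does); is_valid_utf8 is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: succeeds only if stdin is well-formed UTF-8.
# Assumes iconv exits non-zero when it meets an invalid input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Same bytes as the x2 test file; octal \363 is 0xF3, written in octal
# because not every printf implementation understands \x escapes.
if printf 'a\363bcdefgh' | is_valid_utf8; then
    echo "valid UTF-8"
else
    echo "invalid UTF-8"
fi
```

With such an iconv this reports the data as invalid, which matches the expectation that a UTF-8 locale should treat the 0xF3 byte as an encoding error.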


$ grep --version x2
grep (GNU grep) 3.12
Copyright (C) 2025 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.savannah.gnu.org/cgit/grep.git/tree/AUTHORS>.

grep -P uses PCRE2 10.45 2025-02-05
--
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )




