Hello,

On 15/09/18 11:57 AM, 21na...@gmail.com wrote:
> On 15/09/2018 at 19:06, Eric Blake wrote:
>> On 9/15/18 11:43 AM, 21na...@gmail.com wrote:
>>> But is it at least possible to find “\x0A\x00” with grep?
>>
>> If you bend the rules by throwing -P into the mix, yes :)
>
> So it is possible to find “\x0A\x00” alone, but for example “\x74\x00\x0D\x00\x0A\x00\x74\x00\x65\x00” is impossible to find with the “-P” option?

If I may suggest a different tool, GNU sed can handle such regexes more easily than grep.
The 'trick' is to accumulate multiple lines into memory, then run the
regex on the entire buffer.

1.
If your input is small enough to fit in memory,
you can load the entire file into memory,
and run the regex on the buffer:

$ printf '\xFF\xFE\x0D\x00\x0A\x00\x74\x00\x65\x00\x73\x00\x74\x00\x0D\x00\x0A\x00\x74\x00\x65\x00\x73\x00\x74\x00\x5F\x00\x74\x00\x77\x00\x6F\x00\x0D\x00\x0A\x00' \
     | LC_ALL=C sed -n 'H;$!d ; x ; /\x0a\x00/q0 ; q1' \
           && echo MATCH || echo NO-MATCH

The "H;$!d" commands accumulate lines into the hold buffer.
The "x" command swaps the hold buffer and the pattern buffer, so the
accumulated text ends up in the pattern buffer.
Then the regex "/\x0a\x00/" searches in the buffer.
If there was a match, sed quits with exit code 0 ("q0").
Otherwise, sed quits with exit code 1 ("q1").
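
As a rough sketch, the same idea can be wrapped in a small shell function
so it is easy to try on other inputs. The function name "has_lf_nul" is
made up for illustration, and the "1h;1!H" opening is my own variation,
not part of the command above: a plain "H" on the first line leaves a
leading newline in the hold buffer, which I believe could give a false
match when the input itself starts with \x00. The test data uses octal
escapes because not every printf understands \x.

```shell
# Hypothetical helper: exit status 0 if stdin contains the bytes 0A 00.
# "1h;1!H" stores the first line as-is and appends the later ones,
# so the hold buffer does not begin with an extra newline.
has_lf_nul() {
    LC_ALL=C sed -n '1h;1!H;$!d ; x ; /\x0a\x00/q0 ; q1'
}

# UTF-16LE "a", CR, LF, "b": the LF code unit is the byte pair 0A 00.
printf 'a\000\015\000\012\000b\000' | has_lf_nul && echo MATCH || echo NO-MATCH
# MATCH

# UTF-16LE "ab" with no line break: no 0A byte anywhere.
printf 'a\000b\000' | has_lf_nul && echo MATCH || echo NO-MATCH
# NO-MATCH

# Input that starts with 00: still no 0A 00 pair in the data.
printf '\000b\000' | has_lf_nul && echo MATCH || echo NO-MATCH
# NO-MATCH
```

This still needs GNU sed, since \xHH escapes and exit codes on "q" are
GNU extensions.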


2.
If the file is too big to fit in memory,
you can process it line-by-line like so:

$ printf '\xFF\xFE\x0D\x00\x0A\x00\x74\x00\x65\x00\x73\x00\x74\x00\x0D\x00\x0A\x00\x74\x00\x65\x00\x73\x00\x74\x00\x5F\x00\x74\x00\x77\x00\x6F\x00\x0D\x00\x0A\x00' \
     | LC_ALL=C sed -n 'N;/\x0a\x00/q0;$q1;D;' \
             && echo MATCH || echo NO-MATCH

The N and D commands work in tandem: N appends the next line to the
pattern buffer, and D deletes the oldest line from it (think FIFO).
The regex therefore always sees the two most recent lines, which is
enough here because the pattern crosses only one newline.
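
To see the two-line window for yourself, here is a small sketch on plain
text that prints the pattern buffer on every cycle with the "l" command.
The second command is an assumption on my part rather than something from
the example above: with a one-line input, a bare "N" finds no next line
and GNU sed simply exits with status 0, so "$!N" is one way to make sure
"$q1" is still reached.

```shell
# Show the sliding window: "l" prints the pattern buffer unambiguously,
# with the embedded newline shown as \n and the end marked by $.
printf 'a\nb\nc\nd\n' | sed -n 'N;l;D'
# a\nb$
# b\nc$
# c\nd$

# Assumed hardening for one-line input: "$!N" skips N on the last line,
# so the script falls through to "$q1" and reports no match.
printf 'abc\n' | LC_ALL=C sed -n '$!N;/\x0a\x00/q0;$q1;D' && echo MATCH || echo NO-MATCH
# NO-MATCH
```

Each printed window overlaps the previous one by one line, which is why a
pattern that spans a single line break cannot slip between two windows.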



More details are in the manual:
https://www.gnu.org/software/sed/manual/sed.html#Multiline-techniques
https://www.gnu.org/software/sed/manual/sed.html#Text-search-across-multiple-lines



regards,
 - assaf



