On 9/15/18 12:57 PM, 21na...@gmail.com wrote:

But is it at least possible to find “\x0A\x00” with grep?

If you bend the rules by throwing -P into the mix, yes :)

So it is possible to find “\x0A\x00” alone, but for example “\x74\x00\x0D\x00\x0A\x00\x74\x00\x65\x00” is impossible to find with the “-P” option?

Correct. It is impossible to find the record terminator in the middle of a pattern, whether that terminator is \n (default) or NUL (-z). It is therefore impossible to find a multi-record match using grep. The string you listed contains both \x00 and \x0a, so regardless of which of those two bytes you pick as the record terminator, it is impossible to use grep to find that substring in your file. You'll have to resort to a tool that supports multiline matching, since grep is not such a tool.
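A small demonstration of that record rule, as a sketch on made-up sample input (the strings and variable names below are illustrative, not taken from your file):

```shell
# In the default mode each record ends at \n, so no pattern can span two lines.
across_newline=$(printf 'ab\ncd\n' | grep -c 'b.c' || true)

# Control: the same input matches when the pattern stays inside one record.
within_record=$(printf 'ab\ncd\n' | grep -c 'cd' || true)

# With -z each record ends at NUL instead, so a match cannot span a NUL either.
across_nul=$(printf 'a\0b\0' | grep -zc 'a.b' || true)

echo "$across_newline $within_record $across_nul"
```

The `|| true` is only there because grep exits with status 1 when nothing matches, which would otherwise trip scripts that run under `set -e`.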

It IS possible, of course, to change your data, for example:

tr '\0' '\377' < file | grep "$modified_pattern" | tr '\377' '\0'

assuming that the byte \xff (spelled \377 in octal, which is the escape form tr understands) doesn't appear anywhere else in the file; although it may make matching harder if you don't have the right record terminators any longer. Or, if your input data is encoded in UTF-16, it's easiest to convert it into UTF-8 for the grep:

iconv -f UTF-16 -t UTF-8 < file | grep "$modified_pattern" \
  | iconv -f UTF-8 -t UTF-16
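As a rough sketch of how that conversion can help with your specific string: the bytes you listed decode as "t", CR, LF, "te" in UTF-16LE. Once the data is UTF-8 it should no longer contain NUL bytes, so combining -z with -P lets the CR LF sit inside the pattern. The sample text, the temporary file, and the assumption of little-endian input without a byte-order mark are all mine, and -P is only available when grep was built with PCRE support:

```shell
# Hypothetical sample: three CRLF-terminated lines encoded as UTF-16LE.
sample=$(mktemp)
printf 'first\r\ntest\r\nlast\r\n' | iconv -f UTF-8 -t UTF-16LE > "$sample"

# After conversion the text has no NUL bytes, so -z treats the whole stream as
# one record, and -P lets \r\n be spelled out inside the pattern.
hits=$(iconv -f UTF-16LE -t UTF-8 < "$sample" | grep -zcP 't\r\nte' || true)

echo "$hits"
rm -f "$sample"
```

Note that with -z the count is in NUL-terminated records, not lines, so it only tells you whether the sequence occurs; adding -o instead of -c would print the matched text itself.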

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


