Thanks, but that patch seems to depend on libpcre internals, in that it
"knows" that pcre_exec cannot possibly succeed without first checking
its entire input buffer for invalid UTF-8 bytes. Even if that's true
now, it reflects a performance bug that might be fixed in a future
libpcre version.

Also, I don't see why grep needs to copy the buffer when there's an
encoding error. Why not simply rerun the matcher on the initial prefix
that doesn't have an encoding-error byte, and then (if that doesn't
find a match) try matching the suffix after the encoding-error byte?
This approach would not only avoid the buffer copy, it would also avoid
knowledge of libpcre internals.
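
To make the prefix-then-suffix idea concrete, here is a rough sketch in C. It is not code from grep or from any patch in this thread. The names first_encoding_error, match_around_errors, matcher_fn and toy_match are all hypothetical, the UTF-8 check is deliberately simplified (it does not reject overlong forms or surrogates), and toy_match merely stands in for the real pcre_exec call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher type: returns the offset of the first match
   in buf[0..len), or -1 if there is none.  In grep -P this role
   would be played by a call to pcre_exec.  */
typedef long (*matcher_fn) (const char *buf, size_t len);

/* Return the offset of the first byte that cannot start or continue
   a UTF-8 sequence, or len if none is found.  Simplified: overlong
   encodings and surrogates are not rejected.  */
static size_t
first_encoding_error (const char *buf, size_t len)
{
  const unsigned char *p = (const unsigned char *) buf;
  size_t i = 0;
  while (i < len)
    {
      unsigned char c = p[i];
      size_t n;
      if (c < 0x80) n = 1;
      else if (0xC2 <= c && c <= 0xDF) n = 2;
      else if (0xE0 <= c && c <= 0xEF) n = 3;
      else if (0xF0 <= c && c <= 0xF4) n = 4;
      else return i;
      if (len - i < n)
        return i;               /* truncated sequence */
      for (size_t k = 1; k < n; k++)
        if ((p[i + k] & 0xC0) != 0x80)
          return i;             /* bad continuation byte */
      i += n;
    }
  return len;
}

/* Run MATCH on each maximal error-free chunk of BUF in turn, without
   copying the buffer.  Return the match offset relative to BUF, or
   -1 if no chunk matches.  */
static long
match_around_errors (const char *buf, size_t len, matcher_fn match)
{
  assert (buf != NULL && match != NULL);
  size_t start = 0;
  for (;;)
    {
      size_t valid = first_encoding_error (buf + start, len - start);
      long m = match (buf + start, valid);
      if (0 <= m)
        return (long) start + m;
      if (len <= start + valid)
        return -1;              /* no more input after this chunk */
      start += valid + 1;       /* skip the encoding-error byte */
    }
}

/* Stand-in matcher for illustration: finds the literal "ab".  */
static long
toy_match (const char *buf, size_t len)
{
  for (size_t i = 0; i + 2 <= len; i++)
    if (memcmp (buf + i, "ab", 2) == 0)
      return (long) i;
  return -1;
}
```

With this sketch, match_around_errors ("xx\xffyab", 6, toy_match) tries "xx" first, skips the 0xFF byte, and then finds the match in "yab". The matcher only ever sees chunks that the caller has already checked, so nothing here relies on how libpcre validates its input.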