Jim Meyering wrote:
> The fact that a Turkish I-with-dot (U+0130) on a matched line of input
> can make "grep -i" generate corrupt output in nearly any UTF-8 locale is
> pretty serious, so I want to make a bug-fix release.
>
> Does anyone have a pending change or a bug report that we should
> consider first?

For reference, here are the pending NEWS entries:

** Bug fixes

  grep -i, in a multi-byte locale, when matching a line containing
  a character like the UTF-8 Turkish I-with-dot (U+0130) (whose
  lower-case representation occupies fewer bytes), would print an
  incomplete output line.  Similarly, with a matched line containing
  a character (e.g., the Latin capital I in a Turkish UTF-8 locale),
  where the lower-case representation occupies more bytes, grep could
  print garbage.
  [bug introduced in grep-2.6]

  --include and --exclude can again be combined, and again apply to
  the command line, e.g., "grep --include='*.[ch]' --exclude='system.h'
  PATTERN *" again reads all *.c and *.h files except for system.h.
  [bug introduced in grep-2.6]

** New features

  'grep' without -z now treats a sparse file as binary, if it can
  easily determine that the file is sparse.

** Dropped features

  Bootstrapping with Makefile.boot has been broken since grep 2.6,
  and was removed.
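
To make the first bug-fix entry concrete, here is a small Python sketch of
how case conversion can change a character's UTF-8 byte length.  It is not
grep's code.  The two mapping tables are hard-coded assumptions standing in
for a C library's towlower() in a non-Turkish and a Turkish locale, and the
"stale offset" part assumes a hypothetical matcher that searches a
lower-cased copy of the line and then reuses those byte offsets on the
original.

```python
# Illustration only: hard-coded lower-case mappings (assumptions, standing
# in for locale-dependent towlower() results).
SIMPLE_LOWER = {"\u0130": "i"}                # I-with-dot -> plain i
TURKISH_LOWER = {"\u0130": "i", "I": "\u0131"}  # capital I -> dotless i


def utf8_len(text):
    """Number of bytes TEXT occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def byte_length_change(ch, table):
    """Return (bytes before, bytes after) lower-casing CH with TABLE."""
    lowered = table.get(ch, ch.lower())
    return utf8_len(ch), utf8_len(lowered)


# U+0130 shrinks from two bytes to one; in the Turkish table, capital I
# grows from one byte to two.
shrink = byte_length_change("\u0130", SIMPLE_LOWER)
grow = byte_length_change("I", TURKISH_LOWER)

# Hypothetical matcher: find "foo" in a lower-cased copy, then slice the
# ORIGINAL bytes with the offset found in the copy.
original = "\u0130 foo".encode("utf-8")
lowered = "".join(SIMPLE_LOWER.get(c, c.lower()) for c in "\u0130 foo").encode("utf-8")
offset = lowered.find(b"foo")
stale = original[offset:offset + 3]   # wrong bytes: offset is off by one
correct_offset = original.find(b"foo")

print(shrink, grow, offset, correct_offset, stale)
```

Any scheme that carries byte offsets from a case-converted buffer back to
the original has to account for this difference in length, in either
direction.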
