On Sun, Mar 06, 2016 at 10:38:41PM +0100, Pierre Labastie wrote:
> On 06/03/2016 21:19, Bruce Dubbs wrote:
> > The most recent version of grep is causing problems. If it processes a file
> > that has a character that is not valid in the locale set by LANG, it stops
> > processing and outputs "Binary file <name> matches".
> >
> > This problem came up in building lxqt as the .desktop files have a lot of
> > characters for different languages. I think these are all utf-8, but I'm
> > not sure.
> >
> > There are several ways to work around this problem.
> >
> > 1. export GREP_OPTIONS=--text or its equivalent GREP_OPTIONS=-a
> >
> > The man page says
> >
> > GREP_OPTIONS
> > This variable specifies default options to be placed in front of any
> > explicit options. As this causes problems when writing portable scripts,
> > this feature will be removed in a future release of grep, and grep warns
> > if it is used. Please use an alias or script instead.
> >
> > 2. I don't think an alias would be a good option as that would not be picked
> > up in scripts.
> >
> > alias grep='grep -a'
> >
> > 3. We could create a script like yacc.
> >
> > cd /bin
> > mv grep grep.orig
> > cat > grep << "EOF"
> > #! /bin/sh
> > exec /bin/grep.orig --text "$@"
> > EOF
> > chmod 755 grep
> >
> > But there are times when we do not want the --text behavior.
> >
> > 4. export LANG=en_US.utf8 where necessary.
> >
> > The problem here is trying to pick up all the places where it is necessary.
> > If a user already has LANG set to a value like fr_FR.utf8, I don't think it
> > would be needed. It also would not solve the problem if there are non-utf8
> > characters in the file being searched.
> >
> > I'll note that I have already addressed this in
> >
> > http://www.linuxfromscratch.org/blfs/view/svn/postlfs/cacerts.html
> >
> > where I had to add export LANG=en_US.utf8 to /usr/bin/make-ca.sh.
> >
> > For right now, I'm going to go with 4, but am not totally happy with that
> > solution.
> >
> > Feedback appreciated.
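
A middle ground between options 1 to 3 is to pass -a only on the grep calls that need it, rather than changing grep globally. The following is a minimal sketch, not something taken from the book: the sample file and the variable names (tmp, name_line, match_count) are made up for illustration, and whether the plain grep call reports a binary file depends on the grep version and the locale in effect.

```shell
#!/bin/sh
# Hypothetical sample: a .desktop-style line containing a Latin-1 byte (0xE9),
# which is not valid UTF-8.
tmp=$(mktemp)
printf 'Name=caf\351\nComment=plain ascii\n' > "$tmp"

# Plain grep: depending on grep version and locale, this prints either the
# matching line or only a "binary file matches" notice.
grep 'Name=' "$tmp" || true

# With -a (--text) on this one invocation, the file is treated as text and
# the matching line itself is printed.
name_line=$(grep -a 'Name=' "$tmp")
match_count=$(grep -a -c 'Name=' "$tmp")

rm -f "$tmp"
```

This keeps the default behaviour for every other caller, which addresses the "there are times when we do not want --text" objection, at the cost of having to find each affected call.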
> >
>
> I realize that, although I set LANG=fr_FR.UTF-8, jhalfs sets LC_ALL=C (this
> can be changed, but I only just realized it). That may be why I got issues
> with lxqt .desktop files. OTOH, from man grep:
> ---------------------
> Within a bracket expression, a range expression consists of two characters
> separated by a hyphen. It matches any single character that sorts between
> the two characters, inclusive, using the locale's collating sequence and
> character set. For example, in the default C locale, [a-d] is equivalent
> to [abcd]. Many locales sort characters in dictionary order, and in these
> locales [a-d] is typically not equivalent to [abcd]; it might be equivalent
> to [aBbCcDd], for example. To obtain the traditional interpretation of
> bracket expressions, you can use the C locale by setting the LC_ALL
> environment variable to the value C.
> --------------------
> So if a package build system relies on the LC_ALL=C behavior, "4" could lead
> to issues...
>
> But I do not have a better alternative to propose.
>

As a general rule, I think 4 is probably the best option - perhaps we could
have a copy member to explain it, and in that we could mention that any
installed UTF-8 locale should be fine?
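
To make the "any installed UTF-8 locale" idea concrete, here is a rough sketch under a few assumptions: the helper name first_utf8_locale and the sample file are hypothetical, and the locale list comes from `locale -a`, which may be missing on some systems. It sets LC_ALL rather than LANG for the single grep call, because LC_ALL overrides LANG (which is why jhalfs' LC_ALL=C wins over LANG=fr_FR.UTF-8), and it leaves the rest of the build in its existing locale so range expressions such as [a-d] keep their C-locale meaning.

```shell
#!/bin/sh
# Hypothetical helper: print the first installed UTF-8 locale listed by
# `locale -a`, or nothing if none is found or `locale` is unavailable.
first_utf8_locale() {
    command -v locale >/dev/null 2>&1 || return 0
    locale -a 2>/dev/null | grep -i -E 'utf-?8' | head -n 1
}

# Hypothetical sample: a .desktop-style line with valid UTF-8 ("café").
tmp=$(mktemp)
printf 'Name=caf\303\251\n' > "$tmp"

utf8_locale=$(first_utf8_locale)
if [ -n "$utf8_locale" ]; then
    # Scope the locale to this one command only.
    result=$(LC_ALL="$utf8_locale" grep 'Name=' "$tmp")
else
    # No UTF-8 locale available: fall back to treating the file as text.
    result=$(grep -a 'Name=' "$tmp")
fi

rm -f "$tmp"
```

As already noted for option 4, this does not help when the file contains bytes that are not valid UTF-8; in that case -a on the individual call is still needed.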

ĸen
-- 
This email was written using 100% recycled letters.