On 07/22/2010 12:49 PM, Mihai Moldovan wrote:
> Hi,
>
> I have come to notice that cut is not yet multi byte/wide char aware.

Yes, and neither are a lot of the other coreutils. This is a well-known issue, and it is mentioned in the TODO. Several distros carry add-on patches that add wide char support, but to date no one has submitted a patch upstream that is both easy to maintain (it doesn't needlessly duplicate big blocks of code over char vs. wchar_t) and doesn't penalize speed in single-byte locales. We've got some ideas on what is needed, and gnulib is certainly getting closer to what we need (Bruno's work on libunistring will be a key player in an acceptable patch), but it takes time to pull it all together.

> (Is this even considerable as a bug, or just a "feature" in that only
> one byte delimiters are allowed by default?)

Yes, it can be considered a bug, and any extra help would be welcome. Unfortunately, to date there has been no one willing to step forward to scratch this itch as their highest priority.

-- 
Eric Blake [email protected] +1-801-349-2682
Libvirt virtualization library http://libvirt.org
