Bug#500501: More detailed analysis

2009-11-19 Thread Dmitri Gribenko
Hi, please see the more detailed analysis in bug #555922, which I filed against libc (sed's regex implementation is based on, or is a copy of, libc's, so this bug affects many more packages). In my opinion, the proper solution for sed would be: 1. --binary option should throw sed in

Bug#500501: More detailed analysis

2009-11-19 Thread Paolo Bonzini
In my opinion, the proper solution for sed would be: 1. --binary option should throw sed in a true binary mode without any knowledge of UTF-8 or any other multibyte encodings. This would allow processing binary files without any UTF-8 logic. And this would allow direct manipulation of

Bug#500501: More detailed analysis

2009-11-19 Thread Dmitri Gribenko
On Thu, Nov 19, 2009 at 5:18 PM, Paolo Bonzini bonz...@gnu.org wrote: --binary is strictly for Windows support.  There is already one such mode, it's called LANG=C. I didn't think about that. Thanks. From http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html A period ( '.'
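
As a rough illustration of the LANG=C suggestion above, here is a Python analogy (not sed or libc code). A byte-oriented regex treats every byte as one character, so '.' also matches bytes that can never occur in valid UTF-8. The function name and the sample data are made up for this sketch.

```python
import re


def replace_each_byte(data: bytes, filler: bytes = b"X") -> bytes:
    """Replace every byte except newline with `filler`.

    Hypothetical helper: it mimics how a single-byte (C locale style)
    matcher sees the input, with no multibyte decoding at all.
    """
    # A bytes pattern makes Python's re module match byte by byte.
    return re.sub(rb".", filler, data)


sample = b"a\xffb"  # 0xff never appears in valid UTF-8
print(replace_each_byte(sample))
```

Whether a given sed build behaves the same way under LANG=C depends on its regex library; the sketch only shows the byte-oriented view of the data.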

Bug#500501: More detailed analysis

2009-11-19 Thread Paolo Bonzini
From http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html A period ( '.' ), when used outside a bracket expression, is a BRE that shall match any character in the supported character set except NUL. My point here is that the current implementation of regexes makes '.' NOT
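
The message is cut off here, so the following does not try to reconstruct its argument. It only illustrates the quoted wording "any character in the supported character set": in UTF-8, some bytes do not form a character at all, so "any character" and "any byte" are different sets. This is a minimal Python sketch; `classify_utf8` is a hypothetical helper, not part of sed or libc.

```python
def classify_utf8(data: bytes):
    """Split `data` into (chunk, is_character) pairs.

    Each valid UTF-8 character becomes one chunk marked True; each byte
    that cannot be decoded becomes a one-byte chunk marked False.
    """
    # surrogateescape maps every undecodable byte 0xNN to U+DCNN
    # instead of raising UnicodeDecodeError.
    decoded = data.decode("utf-8", errors="surrogateescape")
    result = []
    for ch in decoded:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            # Recover the original stray byte from its escape code point.
            result.append((bytes([code - 0xDC00]), False))
        else:
            result.append((ch.encode("utf-8"), True))
    return result


for chunk, is_character in classify_utf8(b"a\xc3\xa9\xffb"):
    print(chunk, "character" if is_character else "stray byte")
```

A matcher that works on characters has to decide what to do with the chunks marked False, which is the kind of case the POSIX sentence does not spell out.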