On 3/25/09, Glenn Fowler <[email protected]> wrote:
>
> On Wed, 25 Mar 2009 17:08:11 +0100 Jennifer Pioch wrote:
> > On 3/24/09, Glenn Fowler <[email protected]> wrote:
> > >
> > > here's what the regexp page says:
> > >
> > >   The Simple Regular Expressions described below differ from the
> > >   Internationalized Regular Expressions described on the regex(5) manual
> > >   page in the following ways:
> > >
> > >   * only Basic Regular Expressions are supported
> > >   * the Internationalization features--character class, equivalence class,
> > >     and multi-character collation--are not supported.
> > >
> > > if these are indeed the only differences then I can add a REG_NOI18N
> > > regcomp() flag -- but I need verification of exactly what that means
> > > does that mean that it is byte based, or does . match a multibyte char?
>
> > It supports and matches multibyte characters, only supports Basic
> > Regular Expressions and does not support the extended set of character
> > *classes*.
> > The name REG_NOI18N would be misleading, its better to call it
> > REG_REGEXP (basic regexp).
>
>
> I was concerned about multibyte because the regexp text refers to byte
> instead of character in some places
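To make the byte-versus-character question concrete, here is a minimal sketch of the two possible behaviours of `.` for the pattern '^.$'. It uses Python's `re` module purely as a stand-in; it is not the Solaris regexp code or the AST regcomp(), and the helper name `one_unit_lines` is made up for illustration. The input is assumed to be UTF-8, where 'ä' is encoded as two bytes.

```python
import re

# Same three lines as a small test file; "\u00e4" is 'ä'.
TEXT = "a\n\u00e4\nb\n"


def one_unit_lines(lines, pattern):
    """Return the lines on which the pattern matches (hypothetical helper)."""
    return [line for line in lines if re.search(pattern, line)]


# Character-based matching: '.' consumes one whole character,
# so the line holding 'ä' counts as a single-character line.
char_hits = one_unit_lines(TEXT.splitlines(), r"^.$")

# Byte-based matching: the same text as UTF-8 bytes. 'ä' is two bytes,
# '.' consumes only one of them, so '^.$' cannot match that line.
byte_hits = one_unit_lines(TEXT.encode("utf-8").splitlines(), rb"^.$")

print(char_hits)
print(byte_hits)
```

If the regexp grep is character based, it should behave like the first case and print all three lines; a byte-based matcher would drop the 'ä' line.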
Solaris regexp matches multibyte characters, but a REG_BINARY option that
matches single bytes instead of characters may be useful for matching
binary data.

> can you verify how the regexp grep works in the C and multibyte locales
> try with a file that has one line with one multibyte char
> and try this pattern
>         '^.$'

$ printf "a\nä\nb\n" | /usr/bin/grep '^.$'
a
ä
b

Jenny
-- 
Jennifer Pioch, Uni Frankfurt
_______________________________________________
ast-users mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-users
