On 4/8/09, Jennifer Pioch <[email protected]> wrote:
> On 3/25/09, Glenn Fowler <[email protected]> wrote:
> >
> > On Wed, 25 Mar 2009 17:08:11 +0100 Jennifer Pioch wrote:
> > > On 3/24/09, Glenn Fowler <[email protected]> wrote:
> > > >
> > > > here's what the regexp page says:
> > > >
> > > > The Simple Regular Expressions described below differ from the
> > > > Internationalized Regular Expressions described on the regex(5)
> > > > manual page in the following ways:
> > > >
> > > > * only Basic Regular Expressions are supported
> > > > * the Internationalization features--character class, equivalence
> > > >   class, and multi-character collation--are not supported.
> > > >
> > > > if these are indeed the only differences then I can add a REG_NOI18N
> > > > regcomp() flag -- but I need verification of exactly what that means
> > > > does that mean that it is byte based, or does . match a multibyte
> > > > char?
> >
> > > It supports and matches multibyte characters, only supports Basic
> > > Regular Expressions and does not support the extended set of character
> > > *classes*.
> > > The name REG_NOI18N would be misleading, its better to call it
> > > REG_REGEXP (basic regexp).
> >
> >
> > I was concerned about multibyte because the regexp text refers to byte
> > instead of character in some places
>
> Solaris regexp matches multibyte characters, but a REG_BINARY option
> to match in singlebyte characters may be useful for matching binary
> data.
>
> > can you verify how the regexp grep works in the C and multibyte locales
> > try with a file that has one line with one multibyte char
> > and try this pattern
> > '^.$'
>
> $ printf "a\nä\nb\n" | /usr/bin/grep '^.$'
> a
> ä
>
> b

Glenn, are you going to make the changes to regex?

Jenny
-- 
Jennifer Pioch, Uni Frankfurt
_______________________________________________
ast-users mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-users
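
The '^.$' test quoted in the thread separates byte-based matching from character-based matching: a line holding a single multibyte character matches only if "." consumes a whole character. The sketch below illustrates that distinction, and nothing more. It uses Python's re module, which is neither the Solaris regexp code nor the AST regcomp() implementation discussed here, and the helper names matching_lines_text and matching_lines_bytes are hypothetical. It assumes the input is UTF-8 when encoded.

```python
import re

PATTERN = r"^.$"  # the same pattern used in the thread

def matching_lines_text(lines):
    """Character-based matching: '.' consumes one whole character."""
    rx = re.compile(PATTERN)
    return [ln for ln in lines if rx.search(ln)]

def matching_lines_bytes(lines, encoding="utf-8"):
    """Byte-based matching: each line is encoded first, so '.' consumes one byte."""
    rx = re.compile(PATTERN.encode("ascii"))
    return [ln for ln in lines if rx.search(ln.encode(encoding))]

if __name__ == "__main__":
    # "\u00e4" is the letter a-umlaut; in UTF-8 it occupies two bytes.
    lines = ["a", "\u00e4", "b"]
    print(matching_lines_text(lines))   # all three lines match
    print(matching_lines_bytes(lines))  # only "a" and "b" match
```

A matcher that prints the multibyte line for this input behaves like the character-based variant, while a byte-based one would drop it.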
