On 28 June 2012 19:00, Frank Chang <frankchan...@gmail.com> wrote:
> Good afternoon, We are trying to match the German string. Munich
> tausendschöne Jungfräulein ausendschçne, using a C/C++ PCRE regex with
> PCRE_UTF8, PCRE_UCP, PCRE_CASELESS options activated which uses the UTF-8
> literals, ö, ä, ç Is it possible to construct a valid PCRE regex which uses
> the UTF-8 literals ö or ä or ç without using codepoints?

It *is* possible, but you must make sure that your compiler's execution
character set is configured so that string literals are emitted as UTF-8
sequences. Is that the case? Try getting a hex dump of the string literal
you're passing to pcre_compile (if necessary, look at the assembler output).

Cheers,
-- 
Giuseppe D'Angelo

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
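
For reference, the hex-dump check suggested above could be sketched roughly as follows. This is only an illustration: hex_dump is a hypothetical helper, not part of PCRE, and the buffer handling is an assumption made to keep the example short.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of PCRE): writes the bytes of `s` into
   `out` as space-separated uppercase hex pairs, e.g. "C3 B6".
   Returns 0 on success, -1 if `out` is too small. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    size_t len = strlen(s);
    /* Each byte needs "XX" plus a separator; the last separator is the NUL. */
    size_t needed = len ? len * 3 : 1;
    size_t i;

    if (out_size < needed)
        return -1;

    out[0] = '\0';
    for (i = 0; i < len; i++) {
        /* Cast to unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(out + i * 3, 3, "%02X", (unsigned char)s[i]);
        out[i * 3 + 2] = (i + 1 < len) ? ' ' : '\0';
    }
    return 0;
}
```

Passing the same literal that goes to pcre_compile (for example "ö") and printing the result shows what the compiler actually emitted: C3 B6 means the literal is UTF-8, while a single F6 byte means it was emitted as Latin-1, which is not valid input under PCRE_UTF8. The corresponding UTF-8 sequences for ä and ç are C3 A4 and C3 A7. If the bytes are wrong, options such as GCC's -fexec-charset=UTF-8 or MSVC's /utf-8 may be worth checking, along with the encoding of the source file itself.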