Thu, 29 Apr 2010, Raul Miller wrote:
> On Tue, Apr 27, 2010 at 7:53 PM, bill lam <[email protected]> wrote:
> > regex default to use utf-8 encoding but those htmls use latin-1.
> > Either convert text to utf-8 or set regex to non-utf8 mode.
> >
> >   rxutf8 0
> 
> After reading
>    open'regex'
> and
>    http://www.pcre.org/pcre.txt
> 
> What I thought I would want
>    rxutf8 do_jregex_ 'PCRE_UTF8 23 b. PCRE_NO_UTF8_CHECK'
> 
> Unfortunately, PCRE_NO_UTF8_CHECK is not defined, and when
> I look for its value, I find
> http://read.pudn.com/downloads126/sourcecode/delphi_control/536510/PCRE/pcre.h__.htm
> 
> which suggests
> PCRE_UTF8=: 16b800
> PCRE_NO_UTF8_CHECK=: 16b2000
> 
> So now I know that I am confused.
> 
> Can anyone suggest how I might be able to use pcre's ability to
> recognize word forming utf8 characters without also losing access
> to latin1 content?
> 
> Thanks,

rxutf8 is intended to be called as either 'rxutf8 0' or 'rxutf8 1'. Do
you mean that the constant for enabling/disabling the utf8 option is
incorrect inside jregex?
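
For reference, here is a small sketch of what the quoted expression
would compute, assuming the two values from the pcre.h copy linked
above are the ones that apply. I have not checked them against what
jregex actually defines, so treat the numbers as an assumption.
'23 b.' is bitwise OR and '17 b.' is bitwise AND:

```j
   NB. values taken from the linked pcre.h copy, not from jregex (assumption)
   PCRE_UTF8=: 16b800
   PCRE_NO_UTF8_CHECK=: 16b2000
   NB. combine the two option bits with bitwise OR
   PCRE_UTF8 (23 b.) PCRE_NO_UTF8_CHECK
10240
   NB. test whether the UTF8 bit is set in a combined option word
   PCRE_UTF8 (17 b.) 10240
2048
```

Whether rxutf8 accepts such a combined value, rather than only 0 or 1,
is exactly the open question.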

-- 
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm