Re: [PHP] preg for unicode strings?

2005-11-08 Thread Richard Lynch
On Sat, November 5, 2005 3:02 pm, Andy Pieters wrote:
 Hi List

 I am doing some data validation and the following regexp fails

 [\W]

 When using characters like £ or €

 Obviously because they are technically more than one character, even
 though they are only displayed as one.

 The script is encoded in UTF-8

 Anybody know a fix for this?

You could use http://php.net/utf8_decode on it first, and then
validate...

I dunno if that would allow any nasties to get past, but at least it
should validate the input as legal I think...
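
For what it's worth, here is a small sketch of what utf8_decode() does with the two characters from the question. It converts UTF-8 to ISO-8859-1, which has a slot for £ but none for €, so the euro sign comes back as "?". The helper name toLatin1() is made up for illustration, and utf8_decode() is deprecated as of PHP 8.2, hence the @ to silence the notice in this demo.

```php
<?php
// Hypothetical helper: wraps utf8_decode() (UTF-8 -> ISO-8859-1).
// utf8_decode() is deprecated since PHP 8.2; @ hides that notice here.
function toLatin1(string $utf8): string
{
    return @utf8_decode($utf8);
}

$pound = "\u{00A3}"; // £, two bytes in UTF-8
$euro  = "\u{20AC}"; // €, three bytes in UTF-8

var_dump(toLatin1($pound) === "\xA3"); // £ exists in ISO-8859-1
var_dump(toLatin1($euro) === "?");     // € does not, so it is replaced by "?"
```

So validating after the conversion would treat a euro sign and a literal question mark as the same input.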

I always feel overwhelmed by all this multi-lingual character-encoding
multi-byte stuff, frankly.

-- 
Like Music?
http://l-i-e.com/artists.htm

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] preg for unicode strings?

2005-11-06 Thread Niels Ganser
Andy,

try this one: /^(?:[a-zA-Z]{3}|\p{Sc})$/u

You don't need to put \p{Sc} in square brackets, as \p{Sc} is already a 
character class by itself. Sorry if that isn't very clear: it's 5am here 
and I'm off to bed ;p
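
A minimal sketch of the idea, matching either a three-letter code or a single currency symbol, with the alternation grouped so that both anchors apply to both branches. The function name isCurrencyToken() is hypothetical, and the sketch assumes the input is valid UTF-8 and that PCRE was built with Unicode property support.

```php
<?php
// Hypothetical helper: true for a three-letter code or one currency symbol.
// Assumes $token is valid UTF-8 and PCRE has Unicode property support.
function isCurrencyToken(string $token): bool
{
    // (?: ... ) groups the alternation so ^ and $ apply to both branches.
    return preg_match('/^(?:[a-zA-Z]{3}|\p{Sc})$/u', $token) === 1;
}

var_dump(isCurrencyToken("USD"));      // three letters
var_dump(isCurrencyToken("\u{00A3}")); // £
var_dump(isCurrencyToken("\u{20AC}")); // €
var_dump(isCurrencyToken("US"));       // too short, and not a currency symbol
```

Without the group, the ^ would only bind to the letters branch and the $ only to the \p{Sc} branch.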

Regards,
Niels



Andy Pieters:
 Hi

 Thank you for your reply.

 My regexp was

 /^([a-zA-Z]{3,}|[\W])/

 Meaning match any string that starts with either
 3 or more letters
 or
 1 non-word character

 I'd like to change this to
 3 letters
 or
 1 currency character

 So I changed the regexp accordingly
 /^([a-zA-Z]{3,}|[\p{Sc}])/u

 And I tested with £

 but it fails.

 Any ideas?




[PHP] preg for unicode strings?

2005-11-05 Thread Andy Pieters
Hi List

I am doing some data validation and the following regexp fails

[\W]

When using characters like £ or €

Obviously because they are technically more than one character, even though 
they are only displayed as one.
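
To make that concrete, a quick sketch: in UTF-8, £ is stored as two bytes and € as three, which is what strlen() reports, while a pattern with the u modifier sees each of them as one character. The helper countCodePoints() is just an illustrative name, and the sketch assumes valid UTF-8 input.

```php
<?php
// Hypothetical helper: counts code points by matching "." in UTF-8 mode.
// Assumes $s is valid UTF-8.
function countCodePoints(string $s): int
{
    return (int) preg_match_all('/./us', $s);
}

$pound = "\u{00A3}"; // £
$euro  = "\u{20AC}"; // €

echo strlen($pound), " ", bin2hex($pound), "\n"; // 2 c2a3
echo strlen($euro), " ", bin2hex($euro), "\n";   // 3 e282ac
echo countCodePoints($pound . $euro), "\n";      // 2
```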

The script is encoded in UTF-8

Anybody know a fix for this?

With kind regards


Andy


-- 
Now listening to The Prophet - I Can't Stand It on amaroK
Geek code: www.vlaamse-kern.com/geek
Registered Linux User No 379093
If life was for sale, what would be its price?
www.vlaamse-kern.com/sas/ for free php utilities
--




Re: [PHP] preg for unicode strings?

2005-11-05 Thread Robin Vickery
On 11/5/05, Andy Pieters [EMAIL PROTECTED] wrote:

 I am doing some data validation and the following regexp fails
 [\W]
 When using characters like £ or €
 The script is encoded in UTF-8

Are you using the 'u' modifier to put PCRE in utf-8 mode?

preg_match( '/\W/u', $text);
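
A rough sketch of the difference the modifier makes, assuming PHP's default "C" locale: without u the subject is scanned byte by byte, so the bytes of a single € can each match \W on their own, while with u they are decoded as one character first.

```php
<?php
$euro = "\u{20AC}"; // €, three bytes in UTF-8

// Without /u the subject is scanned byte by byte.
$byteMatches = preg_match_all('/\W/', $euro);

// With /u the three bytes are decoded as one character.
$charMatches = preg_match_all('/\W/u', $euro);

var_dump($byteMatches > $charMatches); // more matches in byte mode
var_dump($charMatches);                // int(1)
```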

  -robin


Re: [PHP] preg for unicode strings?

2005-11-05 Thread Niels Ganser
Andy,

you might want to check out 
http://www.regular-expressions.info/unicode.html

Please note two things while using the described syntax:
1. You have to additionally use the u modifier.
2. While \p{Ll} for instance works in PHP, \p{Lowercase_Letter} doesn't.

Regards,
Niels






Re: [PHP] preg for unicode strings?

2005-11-05 Thread Andy Pieters
Hi 

Thank you for your reply.

My regexp was 

/^([a-zA-Z]{3,}|[\W])/

Meaning match any string that starts with either
3 or more letters
or
1 non-word character

I'd like to change this to
3 letters
or
1 currency character

So I changed the regexp accordingly
/^([a-zA-Z]{3,}|[\p{Sc}])/u

And I tested with £

but it fails.

Any ideas?
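
The thread does not settle why this failed. One possible cause worth ruling out (an assumption, not something confirmed here) is that the test input was not actually UTF-8, for instance a £ arriving as the single ISO-8859-1 byte 0xA3. With the u modifier, preg_match() then returns false rather than 0, and preg_last_error() reports PREG_BAD_UTF8_ERROR. A sketch, assuming a current PHP build with Unicode property support:

```php
<?php
$pattern = '/^([a-zA-Z]{3,}|[\p{Sc}])/u';

$utf8Pound   = "\u{00A3}"; // £ encoded as UTF-8 (bytes C2 A3)
$latin1Pound = "\xA3";     // £ encoded as ISO-8859-1, invalid as UTF-8

$utf8Result   = preg_match($pattern, $utf8Pound);
$latin1Result = preg_match($pattern, $latin1Pound);
$latin1Error  = preg_last_error();

var_dump($utf8Result);   // int(1): the pattern matches a UTF-8 pound sign
var_dump($latin1Result); // bool(false): an error, not "no match"
var_dump($latin1Error === PREG_BAD_UTF8_ERROR);
```

Checking for false separately from 0 tells a bad-encoding error apart from a genuine non-match.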

With kind regards


Andy

On Sunday 06 November 2005 02:11, Niels Ganser wrote:
 Andy,

 you might want to check out
 http://www.regular-expressions.info/unicode.html

 Please note two things while using the described syntax:
 1. You have to additionally use the u modifier.
 2. While \p{Ll} for instance works in PHP, \p{Lowercase_Letter} doesn't.

 Regards,
 Niels



