On Tue, 6 Apr 2004 18:05:32 -0400
gohaku <[EMAIL PROTECTED]> wrote:

> Hi everyone,
> I have some ( actually many ) records in a Database that  I want to 
> "clean"
> Some of these records contain Unicode Text ( Mostly East-Asian )
> 
> I have tried matching for "\W+" and "\S+" but that is not what I am
> looking for because I would like to keep "&" and "-"
> 
> Thanks in advance.
> -gohaku

Hello. The right solution depends on what kind of contamination
is mixed into your records.

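If the contamination consists of unassigned code points, which should
not be used, \p{Assigned}+ may be useful. It matches runs of assigned
characters, including "&" and "-", so anything it does not match can
be removed.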
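
A rough sketch of the same idea outside a regex engine, under some
assumptions: the thread does not say which language is in use, and
Python's built-in re module does not support \p{...} properties, so
this uses the unicodedata module instead (a swapped-in technique, not
the pattern itself). It keeps every code point whose general category
is not Cn (unassigned), which is the set \p{Assigned} matches in Perl.
The name strip_unassigned and the sample string are hypothetical, and
the result depends on the Unicode version bundled with the Python build.

```python
import unicodedata


def strip_unassigned(text: str) -> str:
    """Drop code points whose general category is Cn (unassigned).

    Hypothetical helper: approximates keeping only what \\p{Assigned}
    would match, using the Unicode database shipped with Python.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cn")


# Sample record (made up): East Asian text, "&" and "-", followed by
# U+0378 (a reserved code point) and U+FFFF (a noncharacter).
sample = "Tokyo & \u6771\u4eac - caf\u00e9\u0378\uffff"
cleaned = strip_unassigned(sample)

# ascii() avoids terminal-encoding problems when printing the result.
print(ascii(cleaned))
```

Because "&", "-", and the East Asian characters are all assigned, they
survive; only the two Cn code points at the end are dropped.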


SADAHIRO Tomoyuki
