The replace function is documented here:
http://www.w3.org/TR/xpath-functions/#func-replace
which refers to the regular expression syntax here:
http://www.w3.org/TR/xpath-functions/#regex-syntax
which in turn references Unicode Regular Expressions:
http://www.unicode.org/reports/tr18/

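After intense study (and getting your PhD in Regular Expressions), you will
find that \p{xxx} means "any character with Unicode property xxx". "L" is the
general category Letter, so \p{L} matches letters in every script, accented
ones such as ñ included. Likewise "N" is the category Number, so \p{N} matches
any numeric character. The surrounding [^...] negates the set and the trailing
+ matches one or more such characters in a row, so replace() deletes every run
of characters that is not in the set.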
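
One detail worth checking in the pattern from the thread: inside [...] the "|"
is not an "or" operator but a literal character, so it becomes part of the set
that is kept. A minimal sketch of the difference, assuming a made-up sample
string (not from the original posts) and precomposed accented characters:

```xquery
(: Sketch only: the sample string is invented for illustration. :)
let $string := "Peña & co., (2012) a|b"
return (
  (: pattern as posted: the "|" inside the brackets is literal, so "|" survives :)
  replace($string, '[^\p{L}|\p{N}]+', ''),
  (: same class without the "|": only letters and numbers survive :)
  replace($string, '[^\p{L}\p{N}]+', '')
)
(: ==> "Peñaco2012a|b", "Peñaco2012ab" :)
```

If the input may contain decomposed characters (a base letter followed by a
combining mark), the mark is in category M rather than L, so adding \p{M} to
the set may be needed to keep it.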



-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
d...@marklogic.com
Phone: +1 650-287-2531
Cell:  +1 812-630-7622
www.marklogic.com

> -----Original Message-----
> From: general-boun...@developer.marklogic.com [mailto:general-
> boun...@developer.marklogic.com] On Behalf Of Rajasekaran, Santhosh
> Sent: Monday, May 14, 2012 2:24 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Need to Remove spaces, punctuations,
> parens and ect., from the given string (need to remove all character other
> than A-Z and 0-9)
> 
> Hi Rob,
> 
> Thanks, it worked. Can you briefly explain the meaning of the regex
> "[^\p{L}|\p{N}]+"? It would be great if you could share some reference
> documentation on regular expressions.
> 
> Thanks & Regards,
> Santhosh
> 
> ________________________________________
> From: general-boun...@developer.marklogic.com [general-
> boun...@developer.marklogic.com] On Behalf Of Whitby, Rob, Springer
> Healthcare UK [rob.whi...@springer.com]
> Sent: Friday, May 11, 2012 8:50 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Need to Remove spaces, punctuations,
> parens and ect., from the given string (need to remove all character other
> than A-Z and 0-9)
> 
> I had to do something similar - try this:
> 
> let $string := "Peña, replaces dia char"
> return replace($string, '[^\p{L}|\p{N}]+', '')
> 
> 
> -----Original Message-----
> From: general-boun...@developer.marklogic.com [mailto:general-
> boun...@developer.marklogic.com] On Behalf Of Rajasekaran, Santhosh
> Sent: 11 May 2012 13:08
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Need to Remove spaces, punctuations,
> parens and ect., from the given string (need to remove all character other
> than A-Z and 0-9)
> 
> Hi Jakob,
> 
> Thanks, yes, it worked well, but it also removes characters with diacritics.
> What should I do if I do not want to remove them?
> 
> Eg:  -->
> let $string := "Peña, replaces dia char"
> return replace($string, '[^a-zA-Z]+', '')
> ==> Peareplacesdiachar
> 
> I need "Peñareplacesdiachar"
> 
> Thanks
> Santhosh
> 
> 
> 
> ________________________________________
> From: general-boun...@developer.marklogic.com [general-
> boun...@developer.marklogic.com] On Behalf Of Jakob Fix
> [jakob....@gmail.com]
> Sent: Thursday, May 10, 2012 10:22 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Need to Remove spaces, punctuations,
> parens and ect., from the given string (need to remove all character other
> than A-Z and 0-9)
> 
> Hi,
> 
> it's probably easier if you declare the character groups you want to
> keep and exclude everything else, like so:
> 
> let $string := "AB cd/EF;gh"
> return replace($string, '[^a-zA-Z]+', '') (: everything that's not an
> alphabetical character will be replaced :)
> ==> "ABcdEFgh"
> 
> cheers,
> Jakob.
> 
> 
> On Thu, May 10, 2012 at 4:00 PM, Rajasekaran, Santhosh
> <santhosh.rajaseka...@hmhpub.com> wrote:
> > Hi Folks,
> >
> >
> >
> > I have the below requirement in XQuery.
> >
> >
> >
> > Given a string, I need to remove spaces, punctuation, parens, etc.,
> > i.e. everything except alphabetic (A-Z or a-z) and numeric (0-9) characters.
> >
> >
> >
> > Eg:
> >
> >
> >
> > Input                                       Expected Output
> >
> >
> >
> > San & co.,                              Sanco
> >
> > It is a string                            Itisastring
> >
> > New (value)                          Newvalue
> >
> > At,the hill +  school             Atthehillschool
> >
> > Oh!.. is it, I don't know       OhisitIdontknow
> >
> >
> >
> > Please let me know how I can achieve this. Do I need to list all these
> > characters (spaces, punctuation, parens, etc.) in a regular expression and
> > replace them one by one using the fn:replace() function?
> >
> > Or
> >
> > Is there a better approach?
> >
> >
> >
> > Thanks & Regards,
> >
> > Santhosh
> >
> >
> > _______________________________________________
> > General mailing list
> > General@developer.marklogic.com
> > http://developer.marklogic.com/mailman/listinfo/general
> >
