Yes, under "alternate characters" select "Ignore punctuation". If the collation builder does not let you specify what you want, you can use the rules in the documentation to construct the exact collation URI you need. A good way to test a collation URI is to use cq: write a small program that declares a default collation and then does some string comparison. For example, using the collation builder I built a case-insensitive, diacritic-insensitive, ignore-punctuation collation and then tried the following:
  xquery version "1.0-ml";
  declare default collation "http://marklogic.com/collation/en/S1/T00BB/AS";
  "foo, bar" eq "foo bar"
  (: returns true :)

-Danny

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Friday, April 30, 2010 12:19 PM
To: 'General Mark Logic Developer Discussion'
Subject: Re: [MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?

There's an explicit option for a punctuation-sensitive index in the collation builder, but not for a punctuation-insensitive index. If not specified, is it implicitly punctuation-insensitive? If it is not implicit, does one need to adjust the alternate character settings?

________________________________
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Danny Sokolsky
Sent: Friday, April 30, 2010 3:05 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?

Yes, build a range index with a case-insensitive, diacritic-insensitive, punctuation-insensitive collation. You can look in the "Encodings and Collations" chapter of the Search Developer's Guide to figure out the collation URI, or you can use the little widget in the Admin Interface to build the right collation URI (which will probably be easier...). The widget is in a couple of places (I think on the database config page and on the App Server config page); it is called "collation builder".

-Danny

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Tim Meagher
Sent: Friday, April 30, 2010 10:59 AM
To: 'General Mark Logic Developer Discussion'
Subject: [MarkLogic Dev General] How to make the cts::element-value-match search punctuation-insensitive?
Hi folks,

I have a query that seems to be kind of slow. It uses cts:element-value-query() with the case-insensitive, diacritic-insensitive, and punctuation-insensitive options. I want to speed up the search by creating an element range index on the elements of interest. I noticed, however, that the corresponding lexicon query would be cts:element-value-match(), which provides case-insensitive and diacritic-insensitive search options but no punctuation-insensitive option.

How can I make this lexicon query punctuation-insensitive? Can I do it by building a custom collation with the alternate characters setting set to "avoid punctuation"?

Thank you!
Tim Meagher
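
For reference, here is a minimal sketch of how the lexicon call asked about above might look once the range index exists. It assumes a string element range index has already been configured on a hypothetical element named title, using the collation URI from the reply at the top of this thread; the element name and the pattern are illustrative only and do not come from the thread.

```xquery
xquery version "1.0-ml";

(: Collation URI taken from the reply in this thread. :)
declare variable $collation as xs:string :=
  "http://marklogic.com/collation/en/S1/T00BB/AS";

(: Assumption: a string element range index on <title> (a hypothetical
   element name) has been created with exactly this collation. The
   "collation=" option selects that index for the lexicon lookup. :)
cts:element-value-match(
  xs:QName("title"),
  "foo bar",
  fn:concat("collation=", $collation)
)
```

The collation named in the option has to match the collation of the configured range index; otherwise the call is expected to fail with a missing-index error rather than fall back to another collation.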