Thank you all for your contributions, but (I seem to have a deficiency in the field of collations, and I'm sorry for flooding the list in an attempt to fill this void):
let $options :=
  <search:options xmlns="http://marklogic.com/appservices/search">
    <default-suggestion-source>
      <word>
        <field name="suggest-field" collation="http://marklogic.com/collation/en/S1"/>
      </word>
    </default-suggestion-source>
    <term>
      <term-option>case-insensitive</term-option>
    </term>
  </search:options>
return
  search:suggest("health", $options, 5)

==>
Health
healthcare
healthier
healthiness
healthrelated

Two problems: the sentence-case "Health", and the ignored hyphen in "health-related".

For the latter problem, I added the collation http://marklogic.com/collation/en/S4 to the field specification, so the field now has two collations, /en/S1 and /en/S4. Unfortunately this doesn't return health-related, but still healthrelated. Also, I have no clue as to why it still returns Health instead of health.

How can I specify two collations in my query in order to have results returned that are both case-insensitive and respect punctuation?

(OK, I have now switched to the non-language-specific //S1 and //S4 following Mary's suggestion. The problems persist.)

feeling dafter by the minute ...
Jakob.

On Thu, Apr 12, 2012 at 00:26, Jakob Fix <[email protected]> wrote:
> Will, Colleen,
>
> ah I knew somehow that collations might have something to do with it!
> The search dev guide indicates that
> http://marklogic.com/collation/en/S1 is the one to choose for case and
> diacritic insensitive searches.
>
> I replaced the previous default collation with this collation (it's
> re-indexing as I write). Now that it has finished re-indexing, I'm
> getting a
>
> XDMP-FIELDLXCNNOTFOUND:
> cts:field-word-match(xs:NCName("suggest-field"), "env*", "document",
> (), xs:double("1"), ()) -- No field word lexicon for suggest-field
> http://marklogic.com/collation
>
> error, but the new collation is the only one attached to this field.
> Where does the error message take the default collation from? Or do I
> need both collations?
> I added the collation attribute to
> <default-suggestion-source
> collation="http://marklogic.com/collation/en/S1"> and also added the
> line
>
> declare default collation "http://marklogic.com/collation/en/S1";
>
> to the start of the query, all to no avail.
>
> And then I simply tried the (undocumented?) @collation attribute for
> the <field> element ... tada! This worked:
>
> env
> envahissantes
> envahisseur
> envejece
> envejecen
>
> Well, this opens another question: the collation is for English, but I
> seem to have multiple languages (French at least, and the last two
> words look "different"), so I guess I would just add more collations
> for all languages we may have (provided a license key?). For the
> single purpose of rendering different case irrelevant, would this one
> collation be sufficient?
>
>
> Thanks for prodding me in the right direction.
>
> cheers,
> Jakob.
>
>
>
> On Wed, Apr 11, 2012 at 23:55, Colleen Whitney
> <[email protected]> wrote:
>> I think a case-insensitive collation might accomplish this....
>>
>> Colleen Whitney
>> MarkLogic Corporation
>>
>> Phone +1 650 655 2366
>> email [email protected]
>> web www.marklogic.com
>>
>> This e-mail and any accompanying attachments are confidential. The
>> information is intended solely for the use of the individual to whom it is
>> addressed. Any review, disclosure, copying, distribution, or use of this
>> e-mail communication by others is strictly prohibited. If you are not the
>> intended recipient, please notify us immediately by returning this message
>> to the sender and delete all copies. Thank you for your cooperation.
>>
>> ________________________________________
>> From: [email protected]
>> [[email protected]] On Behalf Of Jakob Fix
>> [[email protected]]
>> Sent: Wednesday, April 11, 2012 2:38 PM
>> To: General Mark Logic Developer Discussion
>> Subject: [MarkLogic Dev General] case sensitivity and search:search
>>
>> Hello,
>>
>> my goal is to search a couple of elements for the type-ahead (aka
>> search:suggest) and to return suggestions. I've created a word field
>> index (called "suggest-field") based on the two elements.
>>
>> The search has to be case insensitive, i.e. currently I get results like
>> this:
>> env
>> ENV
>> envahissantes
>> envahisseur
>> envejece
>>
>> for the first two, I want only "env", not "ENV" or a potential "Env".
>> I've disabled "fast case sensitive searches" on the database level as
>> otherwise this was inherited by the field configuration. I have the
>> basic collation (http://marklogic.com/collation/). Also, I'm using a
>> term-option set to "case-insensitive" (see sample query below). But
>> none of these options makes the search treat differences in case as
>> irrelevant.
>>
>> Which knob do I have to twiddle?
>>
>> let $options :=
>>   <search:options xmlns="http://marklogic.com/appservices/search">
>>     <default-suggestion-source>
>>       <word>
>>         <field name="suggest-field"/>
>>       </word>
>>     </default-suggestion-source>
>>     <term>
>>       <term-option>case-insensitive</term-option>
>>     </term>
>>   </search:options>
>> return
>>
>>   search:suggest("env", $options, 5)
>>
>> cheers,
>> Jakob.
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
