Thanks to everyone who has replied. I can't believe I missed that last paragraph in the documentation.
I'll have a play with doing a word search instead of a value search as well, though I would like the items to be in the right order... I could always filter the results afterwards.

We're still on ML 3.* but I'll follow the advice and knock back the indexing options - it was really just desperation! What I'm not sure about is whether I will still need the one/two character lexicons to search for smaller items. This element can contain any old numbers in the string, so it could be, for instance, 1/89/EC - would that work better with the "one character" and "two character" lexicons to match with "1 89 *"? In this application there will be hundreds of such searches a second across a large database, so speed is likely more important than keeping the size down.

Also, I'm not sure that the combination of the two parts of the search is optimal - is there a better way of specifying an element with a particular attribute AND a particular value?

Thanks again for the help

Dom

From: [email protected] On Behalf Of Danny Sokolsky
Sent: 09 February 2010 19:07
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

Dom,

Doug's suggestion of changing the search string to match multiple words is a good one. For example, consider the following query:

    let $x := <a>123/456/abc</a>
    return (
      cts:contains($x, cts:element-value-query(xs:QName("a"), "123* *", ("wildcarded"))),
      cts:contains($x, cts:element-value-query(xs:QName("a"), "123*", ("wildcarded")))
    )
    => true false

Another thing: you mentioned that you turned on all wildcard indexes. That is probably not your best bet.
I would recommend the following index settings (assuming you are using 4.1-3 or greater) for a good balance between wildcard performance and database size:

    stemmed searches
    word searches
    word positions
    three character searches
    three character word positions
    word lexicon (unicode collation)

You can add other options as needed (such as case-sensitive and diacritic-sensitive), but for most content sets you should not need the 2 and 1 character searches or the trailing wildcards if you have these options enabled.

-Danny

From: [email protected] On Behalf Of Glidden, Douglass A
Sent: Tuesday, February 09, 2010 10:27 AM
To: [email protected]
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

Dom,

I would guess that this issue has to do with word boundaries. When you do an element-value-query for '139*', you are searching only for elements whose values consist of a single word starting with '139' (because '*' does not match across word boundaries), but the slashes in your desired value are probably being interpreted as word boundaries.

There are a couple of options here, I think. One would be to change the search string in the element-value-query to '139* *'. That should match one or more words with the first word starting with '139'. Another option would be to keep the string the same but use an element-word-query; however, that will also return positive for values like '890/1396/ZZ', where any "word" in the element starts with 139.
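The two options described above can be compared side by side with a small sketch in the style of Danny's cts:contains test. The element name a and both sample values are illustrative only, and the results depend on how the server tokenizes the slashes, so no output is shown here.

```xquery
(: Illustrative sample values: one that should be found, one that should not. :)
let $wanted   := <a>139/2004/EC</a>
let $unwanted := <a>890/1396/ZZ</a>

(: Option 1: value query where the first word starts with 139. :)
let $value-q := cts:element-value-query(xs:QName("a"), "139* *", ("wildcarded"))

(: Option 2: word query where any word in the element starts with 139. :)
let $word-q := cts:element-word-query(xs:QName("a"), "139*", ("wildcarded"))

return (
  cts:contains($wanted,   $value-q),
  cts:contains($unwanted, $value-q),
  cts:contains($wanted,   $word-q),
  cts:contains($unwanted, $word-q)
)
```

If the word-boundary explanation holds, the last test is the one to watch: the word query would be expected to accept '890/1396/ZZ' because '1396' starts with '139', which is the false positive described above.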
Doug Glidden
Software Engineer
The Boeing Company
[email protected]

_____

From: [email protected] On Behalf Of Dominic Beesley
Sent: Tuesday, February 09, 2010 13:09
To: [email protected]
Subject: [MarkLogic Dev General] element-value-query with wildcards

Hello,

I've been trying to get the following search working:

    cts:and-query(
      (
        cts:element-attribute-value-query(xs:QName("xm:field"), xs:QName("type"), "number"),
        cts:element-value-query(xs:QName("xm:field"), ('139*'), ("wildcarded"))
      )
    )

This searches for an <xm:field type='number'> element containing something starting with 139. However, I don't seem to be able to get this to work. It doesn't return all the values in my data. It brings back "1398", but I know there is also a value "139/2004/EC" which I'd also like it to find.

If I change this to:

    cts:and-query(
      (
        cts:element-attribute-value-query(xs:QName("xm:field"), xs:QName("type"), "number"),
        cts:element-value-query(xs:QName("xm:field"), ('139/2004/EC'), ("wildcarded"))
      )
    )

it works, but I need to be able to find both at once if possible.

I've set all the wildcard options to true on the database, added an "element word lexicon", and reindexed. Will this not work? Do I need a different type of lexicon, range?

Any help gratefully received.

Cheers
Dom
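For reference, here is a minimal sketch of the original query with the '139* *' suggestion from the replies folded in. The namespace URI bound to xm is a placeholder (the thread never gives the real one), and searching fn:doc() is an assumption about scope. Both patterns are passed together because cts:element-value-query accepts a sequence of strings and matches any of them, so single-word values such as "1398" and multi-word values such as "139/2004/EC" are both candidates.

```xquery
(: Placeholder namespace URI; substitute the one used by the real documents. :)
declare namespace xm = "http://example.com/xm";

cts:search(
  fn:doc(),  (: assumed scope: every document in the database :)
  cts:and-query((
    cts:element-attribute-value-query(
      xs:QName("xm:field"), xs:QName("type"), "number"),
    (: "139*" targets one-word values, "139* *" targets multi-word values :)
    cts:element-value-query(
      xs:QName("xm:field"), ("139*", "139* *"), ("wildcarded"))
  ))
)
```

Whether this performs well at hundreds of searches a second depends on the index settings discussed in the replies, so it is worth testing against real data before relying on it.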
_______________________________________________ General mailing list [email protected] http://xqzone.com/mailman/listinfo/general
