The other avenue you can explore is using a range query (cts:element-range-query and cts:element-attribute-range-query). These require range indexes on the supplied element or attribute, and they match according to collation order (for strings). You might be able to get the semantics you are looking for by using < or > (or one of the other comparison operators) as the range-query operator.
Also, if you are using 3.2, the index advice I gave you is not applicable. Maybe that will be a good reason for you to upgrade to 4.1....

I am not sure what you mean about 1 or 2 char "lexicons". Presumably you mean wildcards. The lexicon queries are the ones that return values right out of the range indexes (cts:element-values, et al). Like I said, in 4.1-3 and later, you should not need those as long as you have 3-char wildcard and a codepoint word lexicon enabled.

-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Dominic Beesley
Sent: Wednesday, February 10, 2010 2:23 AM
To: 'General Mark Logic Developer Discussion'
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

Thanks to everyone who has replied, I can't believe I missed that last paragraph in the documentation.

I'll have a play with doing a word search instead of a value search as well. Though I would like the items to be in the right order... I could always filter the results afterwards.

We're still on ML 3.* but I'll follow the advice and knock back the indexing options - it was really just desperation! What I'm not sure about is whether I will still need the one/two char lexicons to search for smaller items. This element can contain any old numbers in the string, so it could be for instance 1/89/EC - would that work better with "one" and "two character lexicons" to match with "1 89 *"? In this application there will be hundreds of such searches a second across a large database, so speed is likely more important than keeping the size down.

Also I'm not sure that the combination of the two parts of the search is optimal - is there a better way of specifying an element with a particular attribute AND a particular value?
Thanks again for the help

Dom

From: [email protected] [mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: 09 February 2010 19:07
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

Dom,

Doug's suggestion of changing the search string to match multiple words is a good one. For example, consider the following query:

  let $x := <a>123/456/abc</a>
  return (
    cts:contains($x, cts:element-value-query(xs:QName("a"), "123* *", ("wildcarded"))),
    cts:contains($x, cts:element-value-query(xs:QName("a"), "123*", ("wildcarded")))
  )

  => true false

Another thing: you mentioned that you turned on all wildcard indexes. That is probably not your best bet. I would recommend the following (assuming you are using 4.1-3 or greater) index settings for a good balance between good wildcard performance and database size:

  stemmed searches
  word searches
  word positions
  three character searches
  three character word positions
  word lexicon (unicode collation)

You can add other options as needed (such as case-sensitive and diacritic-sensitive), but for most content sets, you should not need the 2 and 1 character searches or the trailing wildcards if you have these options enabled.

-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Glidden, Douglass A
Sent: Tuesday, February 09, 2010 10:27 AM
To: [email protected]
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

Dom,

I would guess that this issue has to do with word boundaries. When you do an element-value-query for '139*', you are searching for only elements whose values contain a single word starting with '139' (because '*' does not match on word boundaries), but the slashes in your desired value are probably being interpreted as word boundaries.

There are a couple of options here, I think. One would be to change the search string in the element-value-query to '139* *'.
That should match one or more words with the first word starting with '139'. Another option would be to keep the string the same but use an element-word-query, but that will also return positive for values like '890/1396/ZZ', where any "word" in the element starts with 139.

Doug Glidden
Software Engineer
The Boeing Company
[email protected]

________________________________
From: [email protected] [mailto:[email protected]] On Behalf Of Dominic Beesley
Sent: Tuesday, February 09, 2010 13:09
To: [email protected]
Subject: [MarkLogic Dev General] element-value-query with wildcards

Hello,

I've been trying to get the following search working:

  cts:and-query(
    (
      cts:element-attribute-value-query(xs:QName("xm:field"), xs:QName("type"), "number"),
      cts:element-value-query(xs:QName("xm:field"), ('139*'), ("wildcarded"))
    )
  )

That is, search for an <xm:field type='number'> element containing something starting with 139.

However I don't seem to be able to get this to work. It doesn't return all the values in my data. It brings back "1398", but I know there is also a value "139/2004/EC" which I'd also like it to find.

If I change this to:

  cts:and-query(
    (
      cts:element-attribute-value-query(xs:QName("xm:field"), xs:QName("type"), "number"),
      cts:element-value-query(xs:QName("xm:field"), ('139/2004/EC'), ("wildcarded"))
    )
  )

it works, but I need to be able to find both at once if possible.

I've set all the wildcard options to true on the database, added an "element word lexicon", and reindexed. Will this not work? Do I need a different type of lexicon, such as a range lexicon?

Any help gratefully received

Cheers

Dom
_______________________________________________ General mailing list [email protected] http://xqzone.com/mailman/listinfo/general
