Thanks to everyone who has replied. I can't believe I missed that last
paragraph in the documentation.

I'll have a play with doing a word search instead of a value search as well,
though I would like the items to be in the right order. I could always
filter the results afterwards.

We're still on ML 3.* but I'll follow the advice and knock back the indexing
options. Turning them all on was really just desperation!

What I'm not sure about is whether I will still need the one and two
character search options to find shorter items. This element can contain any
old numbers in the string, for instance 1/89/EC. Would that work better with
the "one" and "two character" options enabled, to match "1 89 *"? In this
application there will be hundreds of such searches a second across a large
database, so speed is likely more important than keeping the size down.

Also, I'm not sure that the combination of the two parts of the search is
optimal. Is there a better way of specifying an element with a particular
attribute AND a particular value?

Thanks again for the help.

Dom

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: 09 February 2010 19:07
To: General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

 

Dom,

 

Doug's suggestion of changing the search string to match multiple words is a
good one.  For example, consider the following query:

 

let $x := <a>123/456/abc</a>
return
(
cts:contains($x, cts:element-value-query(xs:QName("a"), "123* *",
               ("wildcarded"))),
cts:contains($x, cts:element-value-query(xs:QName("a"), "123*",
               ("wildcarded")))
)

=> true
   false
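
Building on that, here is a minimal sketch of how both of the values from the
original question might be matched in one query. cts:element-value-query
accepts a sequence of strings and matches if any one of them matches, so the
single-word pattern and the multi-word pattern can be passed together. The
element name "a" and the sample values are for illustration only, and this
has not been checked against a 3.x server.

```xquery
(: Illustrative values, taken from the original question. :)
let $values := (<a>1398</a>, <a>139/2004/EC</a>)
(: "139*" targets a value made of a single word starting with 139;
   "139* *" targets a value of several words whose first starts with 139. :)
let $query := cts:element-value-query(xs:QName("a"),
                ("139*", "139* *"), "wildcarded")
for $x in $values
return cts:contains($x, $query)
```

The same two-string sequence could be dropped into the second half of the
cts:and-query from the original message in place of the single '139*' string.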

 

Another thing:  you mentioned that you turned on all wildcard indexes.  That
is probably not your best bet.  I would recommend the following (assuming
you are using 4.1-3 or greater) index settings for a good balance between
good wildcard performance and database size:

 

stemmed searches
word searches
word positions
three character searches
three character word positions
word lexicon (unicode collation)

 

You can add other options as needed (such as case-sensitive and
diacritic-sensitive), but for most content sets, you should not need the 2
and 1 character searches or the trailing wildcards if you have these options
enabled.
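
If you prefer to script these settings rather than set them in the Admin
Interface, the following is a rough sketch using the Admin API, which assumes
a 4.x server. The database name "Documents" is a placeholder, the "basic"
stemming level is an assumption (the list above only says stemmed searches
should be on), and the root collation URI is assumed to be what "unicode
collation" refers to. Changing these settings triggers a reindex if
reindexing is enabled.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

(: "Documents" is a placeholder; substitute your own database name. :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()
(: Assumption: "basic" stemming; other levels exist. :)
let $config := admin:database-set-stemmed-searches($config, $db, "basic")
let $config := admin:database-set-word-searches($config, $db, fn:true())
let $config := admin:database-set-word-positions($config, $db, fn:true())
let $config := admin:database-set-three-character-searches($config, $db, fn:true())
let $config := admin:database-set-three-character-word-positions($config, $db, fn:true())
(: Turn off the options the list above says are usually not needed. :)
let $config := admin:database-set-two-character-searches($config, $db, fn:false())
let $config := admin:database-set-one-character-searches($config, $db, fn:false())
let $config := admin:database-set-trailing-wildcard-searches($config, $db, fn:false())
(: Assumption: root collation URI; this call raises an error if the
   same word lexicon is already configured on the database. :)
let $config := admin:database-add-word-lexicon($config, $db,
    admin:database-word-lexicon("http://marklogic.com/collation/"))
return admin:save-configuration($config)
```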

 

-Danny

 

 

 

From: [email protected]
[mailto:[email protected]] On Behalf Of Glidden,
Douglass A
Sent: Tuesday, February 09, 2010 10:27 AM
To: [email protected]
Subject: RE: [MarkLogic Dev General] element-value-query with wildcards

 

Dom,

 

I would guess that this issue has to do with word boundaries.  When you do
an element-value-query for '139*', you are searching for only elements whose
values contain a single word starting with '139' (because '*' does not match
on word boundaries), but the slashes in your desired value are probably
being interpreted as word boundaries.  There are a couple of options here, I
think.  One would be to change the search string in the element-value-query
to '139* *'.  That should match one or more words with the first word
starting with '139'.  Another option would be to keep the string the same
but use an element-word-query, but that will also return positive for values
like '890/1396/ZZ', where any "word" in the element starts with 139.
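
To make the two options concrete, here is a rough sketch that could be run as
a quick in-memory test. The element name "a" and the sample values are made
up for illustration. I have left out the results because they depend on how
the values are tokenized.

```xquery
let $wanted := <a>139/2004/EC</a>
let $unwanted := <a>890/1396/ZZ</a>
(: Option 1: value query whose first word must start with 139. :)
let $value-query :=
  cts:element-value-query(xs:QName("a"), "139* *", "wildcarded")
(: Option 2: word query, where any word in the element may start with 139. :)
let $word-query :=
  cts:element-word-query(xs:QName("a"), "139*", "wildcarded")
return
(
  cts:contains($wanted, $value-query),
  cts:contains($unwanted, $value-query),
  cts:contains($wanted, $word-query),
  cts:contains($unwanted, $word-query)
)
```

If the reasoning above holds, only the word query should match the second
value, which is the false positive to watch out for.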

Doug Glidden 
Software Engineer 
The Boeing Company 
[email protected] 

 

  _____  

From: [email protected]
[mailto:[email protected]] On Behalf Of Dominic
Beesley
Sent: Tuesday, February 09, 2010 13:09
To: [email protected]
Subject: [MarkLogic Dev General] element-value-query with wildcards

Hello,

 

I've been trying to get the following search working:

 

    cts:and-query(
    (
        cts:element-attribute-value-query(xs:QName("xm:field"),
            xs:QName("type"), "number"),
        cts:element-value-query(xs:QName("xm:field"), ('139*'),
            ("wildcarded"))
    ))

 

This should find <xm:field type='number'> elements whose value starts with
139. However, I don't seem to be able to get it to work: it doesn't return
all the matching values in my data.

It brings back "1398", but I know there is also a value "139/2004/EC", which
I'd like it to find as well.

 

If I change this to:

 

    cts:and-query(
    (
        cts:element-attribute-value-query(xs:QName("xm:field"),
            xs:QName("type"), "number"),
        cts:element-value-query(xs:QName("xm:field"), ('139/2004/EC'),
            ("wildcarded"))
    ))

It works but I need to be able to find both at once if possible.

 

I've set all the wildcard options to true on the database and added a
"element word lexicon" and reindexed. Will this not work? Do I need a
different type of lexicon, range?

 

Any help gratefully received

 

Cheers

 

Dom

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
