Hi Stephen,

I think if you changed the weight of the cts:element-word-query on "line" from 
0.0 to 1.0, that might do the trick.  A weight of 0 says that the query does 
not contribute to the score, and it sounds like you want it to.  You also might 
want to play with the weights in the other two cts:query constructors, maybe 
making them slightly lower.
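
Something along these lines is what I have in mind.  Treat it as a rough 
sketch, not a tested query: the weights of 12.0 and 4.0 are arbitrary starting 
points I picked for illustration, the $searchstring value is a placeholder, and 
I am assuming you search over //address.  Returning cts:score next to each 
result makes it easier to see what a weight change actually does.

```xquery
xquery version "1.0-ml";

(: Placeholder search term; substitute your own input. :)
let $searchstring := "Sometown"
let $options := ("case-insensitive", "punctuation-insensitive")
let $query :=
  cts:or-query((
    (: Non-zero weight, so matches in "line" now contribute to the score. :)
    cts:element-word-query(xs:QName("line"), $searchstring, $options, 1.0),
    (: Assumed values, slightly lower than the original 16.0 and 6.0. :)
    cts:element-word-query(xs:QName("posttown"), $searchstring, $options, 12.0),
    cts:element-word-query(xs:QName("county"), $searchstring, $options, 4.0)
  ))
(: cts:search returns results in relevance order by default. :)
for $address in cts:search(//address, $query)
return <result score="{cts:score($address)}">{$address}</result>
```

If the ordering still looks off, adjust one weight at a time and compare the 
scores between runs.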

Another thing to keep in mind is that the amount of content you have loaded 
will be a factor in the score.  If you are testing this on a small amount of 
content, you might get more exaggerated results than on a large content set.  
This is because the relevance is calculated based on tf/idf, which takes into 
account the number of documents in the database.

There are other things (such as fragmentation) which can affect the relevance 
too, but I would start with playing with the weight values in the cts:query 
constructors.

-Danny

From: [email protected] 
[mailto:[email protected]] On Behalf Of Stephen Bennett
Sent: Tuesday, August 04, 2009 7:57 AM
To: [email protected]
Subject: [MarkLogic Dev General] Scoring a cts:or-query()

I've got documents in my MarkLogic datastore that contain address information 
in the following format:

<address>
<line>13 Some Street</line>
<line>Some District</line>
<posttown>Sometown</posttown>
<county>SomeCounty</county>
<postcode>SC1 1CS</postcode>
</address>

I want to be able to search across the full address, giving a higher weighting 
to results that match in the posttown or county elements. So I've currently got 
the following cts:or-query() set up:

cts:or-query((
    cts:element-word-query(xs:QName('line'), $searchstring,
        ('case-insensitive', 'punctuation-insensitive'), 0.0),
    cts:element-word-query(xs:QName('posttown'), $searchstring,
        ('case-insensitive', 'punctuation-insensitive'), 16.0),
    cts:element-word-query(xs:QName('county'), $searchstring,
        ('case-insensitive', 'punctuation-insensitive'), 6.0)
))

This almost works fine, however I also want the items that contain more 
instances of the search string in line, posttown and county to appear higher in 
the list of results. Currently if two or more addresses appear in the same 
posttown, they get the same score, even if the county or line also contains the 
search term for one.

Thanks in advance.
