Indar,

There are two options here:

1) Use fields or word query settings to boost the weight of sections of 
documents. I like this approach best because relevance isn't really black and 
white.
2) Program your own relevance scheme using query weights and score-simple.

The first is pretty simple. You can configure word query, or a field (the 
configuration approach is similar), to assign weights to elements. The 
relevance score of a section is boosted by this pre-defined weight; the weight 
can also be negative, or the element can be excluded from searches entirely. If 
you configure word query, all word queries operate this way, so you can use the 
Search API as you normally would and everything "just works". However, with 
this approach you cannot guarantee that matches in a specific section will 
always rank above matches in other sections. It sounds like that guarantee may 
be a requirement in your case. I personally think that is not a great design, 
but the decision may be in the hands of others.

The second approach is more sophisticated. You issue one big OR query over the 
different buckets you wish to search:

A: document name
B: metadata
C: document body

You would use the "score-simple" option in your call to cts:search. This 
ignores document frequency and term frequency: whether a word matches once or 
many times, it counts as one match in the scoring. You would also assign a 
weight to each of your sub-queries (A, B, C) so that when there is a match, the 
weight is the deciding factor.

So, what if it matches in multiple places: document name, metadata, AND body? 
The scores are additive. Construct your weights as powers of two (2^N) so that 
each weight is greater than the sum of all the smaller ones: 
16 > (8 + 4 + 2 + 1 + 1/2 + 1/4 + 1/8 + 1/16). The maximum effective value of a 
weight is +/- 16. This will ensure that matches "stack rank" according to your 
rules.
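
As a rough sketch of the second approach (not something I have run against your 
data), the query could look like the following. The element name "doc-name" is 
hypothetical: it assumes the document name is stored in its own element, 
separate from the other metadata, which is not how your sample documents are 
laid out today. "field" and "text" are the metadata and body elements from your 
samples. The weights 4, 2 and 1 are just one choice of powers of two.

```xquery
xquery version "1.0-ml";

(: Sketch only. "doc-name" is an assumed element holding the document
   name; "field" and "text" come from the sample documents. :)
declare function local:build-query($text as xs:string) as cts:query {
  cts:or-query((
    (: A: document name, weight 4 > 2 + 1 :)
    cts:element-word-query(xs:QName("doc-name"), $text, (), 4),
    (: B: metadata, weight 2 > 1 :)
    cts:element-word-query(xs:QName("field"), $text, (), 2),
    (: C: document body, lowest weight :)
    cts:element-word-query(xs:QName("text"), $text, (), 1)
  ))
};

(: score-simple counts a term once per sub-query, so the weights decide
   the ordering; cts:search returns results in relevance order. :)
for $hit in cts:search(/book, local:build-query("document"), "score-simple")
return <hit score="{cts:score($hit)}">{$hit/heading}</hit>
```

Printing cts:score next to each hit makes it easy to check that a name match 
always outranks any combination of metadata and body matches.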

Kelly 

Date: Sun, 26 Jun 2011 01:16:27 +0530 (IST)
From: indar verma <[email protected]>
Subject: [MarkLogic Dev General] Displaying results as per priority
        (assigning weightage to search results)
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"

Hi All,

I have to implement the logic below:
a. First, look in the document name and give it Priority 1, regardless.
b. Then look in the metadata.
c. Third, look in the full text.
        i.   If it occurs in metadata only (Priority 2)
        ii.  If it occurs in metadata and full text (Priority 3)
        iii. If it occurs in full text only (Priority 4, ranked by most 
             occurrences)
My XML structure is like this:
 book1.xml
<book>
 <meta>
  <field field-name="document">my doc</field>
  <field field-name="title">This is a my doc</field>
  ...
</meta>
    <heading>Another1 abc</heading> 
    <para> 
        <text>This is part 2 of first document...</text> 
    </para>
</book>
book2.xml
<book>
 <meta>
  <field field-name="document">my document</field>
  <field field-name="title">This is a my document</field>
  ...
</meta>
    <heading>Another document</heading> 
    <para> 
        <text>This is part 2 of first document...</text> 
    </para>
</book>
book3.xml
<book>
 <meta>
  <field field-name="document">my doc</field>
  <field field-name="title">This is a my doc</field>
  ...
</meta>
    <heading>Another3 abc</heading> 
    <para> 
        <text>This is part 2 of first document...</text> 
    </para>
    <para>
        <text>This is part 3 of first document...</text> 
    </para>
</book>
As per my understanding of XQuery, we need to build a query for the metadata 
and the text, so I built the following XQuery:
declare function local:build-query($text as xs:string) {
  cts:or-query((
    cts:word-query($text, (), 1),
    cts:element-word-query(xs:QName("field"), $text, (), 2)
  ))
};

let $query := local:build-query("document")
for $i in cts:search(/book, $query, ("unfiltered", "score-simple"))
return $i/heading

Could someone please help me optimize this code or, if my approach is wrong, 
suggest the right logic/code?

Thanks in advance,
inji