Re: [MarkLogic Dev General] General Digest, Vol 84, Issue 90

indar verma Mon, 27 Jun 2011 06:43:52 -0700

Hi Kelly,

If the keyword matches in multiple places (document name, metadata, AND body),
then I have to show that result at the top.


I can create a query and give weights to specific sub-queries with
"score-simple". But what if the keyword is not found in the metadata, and I have
to order the results by document frequency and term frequency?

Thanks & Regards,
Indar
================================
Indar


There are two kinds of options here:


1) using fields or word query to boost the weight of sections of documents. I
like this approach the best because relevance isn't really black and white.
2) Programming your own relevance scheme using query weights and score-simple.


The first is pretty simple. You can configure word query or a field (similar
configuration approach) to assign weights to elements. The relevance score of a
section will be boosted by this pre-defined weight, including negative boost, or
exclusion from searches. If you're configuring word query, it means all word
queries operate this way, which means you can use the SearchAPI as you would
normally and everything "just works". However, with this approach, you cannot
ensure that matches in a specific section will always rank above matches in
other sections. It sounds like strict ranking of that kind may be your
requirement. I personally think that is not a great design, but the decision may
be in the hands of others.


The second approach is more sophisticated. You issue a big OR query of each of
the different buckets you wish to search:


A: document name
B: metadata
C: document body


You would use the "score-simple" option in your call to cts:search. This ignores
document frequency, and term frequency - if a word matches one or more times, it
counts as one match in the scoring. You would also assign weights to each of
your sub-queries - A, B, C - so that if there's a match (one or more), the
weight is the deciding factor. 
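
A minimal sketch of such a weighted OR query, assuming the element names from
the sample <book> documents quoted further down. The weights are illustrative,
and the document-name bucket (A) is left out because how to match it depends on
how the name is stored; it would be one more weighted sub-query.

```xquery
xquery version "1.0-ml";

(: Sketch only: element names and weights are assumptions. :)
declare function local:bucket-query($text as xs:string) as cts:query {
  cts:or-query((
    (: B: metadata, matched in the text of any <field> element :)
    cts:element-word-query(xs:QName("field"), $text, (), 4),
    (: C: document body, matched in <heading> or <text> elements :)
    cts:element-word-query(
      (xs:QName("heading"), xs:QName("text")), $text, (), 1)
  ))
};

(: "score-simple" makes each matching sub-query count once,
   however many times the word occurs. :)
for $book in cts:search(/book, local:bucket-query("document"), "score-simple")
return <hit score="{cts:score($book)}">{fn:string($book/heading)}</hit>
```

Returning cts:score alongside each hit makes it easy to check that the results
come back in the order the weights are meant to enforce.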


So, what if it matches in multiple places - document name, metadata, AND body?
The scores are additive. Choose weights that are powers of two (2^N) so that
each weight is greater than the sum of all the smaller ones: 16 > (8 + 4 + 2 + 1
+ 1/2 + 1/4 + 1/8 + 1/16). The maximum effective value of a weight is +/- 16.
This will ensure that matches "stack rank" according to your rules.
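
The arithmetic behind this can be checked with a few lines of plain XQuery. The
weights below are hypothetical values for buckets A, B, and C; the check confirms
that each weight exceeds the sum of all the lower ones, so a match in a
higher-priority bucket cannot be outscored by matches in the lower buckets
combined.

```xquery
xquery version "1.0";

(: Hypothetical weights for buckets A, B, C, in descending powers of two. :)
let $weights := (4, 2, 1)
for $w at $i in $weights
(: Sum of every weight that ranks below the current one; 0 for the last. :)
let $lower := fn:sum(fn:subsequence($weights, $i + 1))
return fn:concat($w, " > ", $lower, " : ", $w gt $lower)
```

Each line of the result should end in "true". With weights such as (3, 2, 1) the
first line would end in "false", because a document matching B and C would tie
with one matching only A.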


Kelly 



________________________________
From: "[email protected]" 
<[email protected]>
To: [email protected]
Sent: Mon, 27 June, 2011 12:30:02 AM
Subject: General Digest, Vol 84, Issue 90



Today's Topics:

   1. Displaying results as per priority    (assigning weightage to
      search results) (indar verma)


----------------------------------------------------------------------

Message: 1
Date: Sun, 26 Jun 2011 01:16:27 +0530 (IST)
From: indar verma <[email protected]>
Subject: [MarkLogic Dev General] Displaying results as per priority
    (assigning weightage to search results)
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"

Hi All,

I have to implement the logic below. The requirement says:
a. First, look in the document name and give it Priority 1, regardless.
b. Then look in the metadata.
c. Third, look in the full text.
        i.   If it occurs in metadata only (Priority 2)
        ii.  If it occurs in metadata and full text (Priority 3)
        iii. If it occurs in full text only (Priority 4, ranked by most
             occurrences)
My XML structure is like this:

book1.xml
<book>
    <meta>
        <field field-name="document">my doc</field>
        <field field-name="title">This is a my doc</field>
        ...
    </meta>
    <heading>Another1 abc</heading>
    <para>
        <text>This is part 2 of first document...</text>
    </para>
</book>

book2.xml
<book>
    <meta>
        <field field-name="document">my document</field>
        <field field-name="title">This is a my document</field>
        ...
    </meta>
    <heading>Another document</heading>
    <para>
        <text>This is part 2 of first document...</text>
    </para>
</book>

book3.xml
<book>
    <meta>
        <field field-name="document">my doc</field>
        <field field-name="title">This is a my doc</field>
        ...
    </meta>
    <heading>Another3 abc</heading>
    <para>
        <text>This is part 2 of first document...</text>
    </para>
    <para>
        <text>This is part 3 of first document...</text>
    </para>
</book>

As per my understanding of XQuery, we need to build a query for the metadata and
the text, so I built the following XQuery:

declare function local:build-query($text as xs:string) {
  cts:or-query((
    (: match anywhere in the document, weight 1 :)
    cts:word-query($text, (), 1),
    (: match in the text of a <field> element, weight 2 :)
    cts:element-word-query(xs:QName("field"), $text, (), 2)
  ))
};

let $query := local:build-query("document")
for $i in cts:search(/book, $query, ("unfiltered", "score-simple"))
return $i/heading

Could someone please help me optimize this code or, if my approach is wrong,
suggest the right logic/code?

Thanks in advance,
inji

------------------------------

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general


End of General Digest, Vol 84, Issue 90
***************************************
