Hi,

It seems that the weight is - at least partly - language dependent.
After changing the xml:lang at the root element of a document, the weight
of the term in the relevance info of that document changes as well.

I'm searching over English and German documents using an or-query:
cts:or-query((
  cts:field-word-query("myfield", $searchterm, "lang=en"),
  cts:field-word-query("myfield", $searchterm, "lang=de")
))

Is the term weight in the relevance info related to the
inverse-document-frequency (IDF)?
Could it be that the IDF of a term is calculated separately in each
language?

How can I prevent this from boosting documents in one of the languages?
Example: A user searches for "brain", which occurs in a lot of English
documents but very few German documents. To me it seems that the German
documents containing "brain" are ranked higher because of scoring with IDF.
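
To illustrate the effect I suspect, here is a rough sketch in Python. The
document counts are invented, idf() uses the textbook log(N/n) form rather
than anything I know about MarkLogic's internals, and score() only mirrors
the formula attribute (8*weight*logtf) from the relevance info quoted below.

```python
import math


def idf(total_docs: int, docs_with_term: int) -> float:
    """Textbook inverse document frequency, log(N / n). An assumption, not MarkLogic's formula."""
    return math.log(total_docs / docs_with_term)


def score(weight: float, logtf: float) -> float:
    """Mirrors formula="8*weight*logtf" from the quoted relevance info."""
    return 8 * weight * logtf


# Hypothetical corpus sizes, invented for illustration only.
corpus = {
    "en": {"total": 100_000, "with_brain": 5_000},
    "de": {"total": 100_000, "with_brain": 50},
}

# IDF computed separately per language: the rarer German "brain" gets the larger value.
per_language = {
    lang: idf(c["total"], c["with_brain"]) for lang, c in corpus.items()
}

# IDF computed once over both languages together: one shared value for every document.
pooled = idf(
    sum(c["total"] for c in corpus.values()),
    sum(c["with_brain"] for c in corpus.values()),
)

for lang, value in per_language.items():
    print(f"{lang}: per-language IDF = {value:.2f}, pooled IDF = {pooled:.2f}")

# Sanity check against the quoted output: weight 47.75 and logtf 20 give 7640.
print(score(47.75, 20))
```

If the weight behaves like the per-language values here, a German document
with the same term frequency would score higher than an English one, which
is the ranking I am seeing.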

Regards,
Andreas


2016-12-15 16:36 GMT+01:00 Andreas Hubmer <[email protected]>:

> Hi,
>
> I'm tuning the result order of a search and not sure about all the parts
> in the output of cts:relevance-info.
>
> Example:
> <qry:relevance-info>
>   ...
>   <qry:term *weight="47.75"*>
>     <qry:score formula="8*weight*logtf" computation="382*20">7640</qry:score>
>     <qry:key>12437021743613916800</qry:key>
>   </qry:term>
> </qry:relevance-info>
>
> What is the weight marked in bold? How is it influenced?
> The term query in my example is a field-word-query.
>
> Thanks,
> Andreas
>
> --
> Andreas Hubmer
> Senior IT Consultant
>
> EBCONT enterprise technologies GmbH
> Millennium Tower
> Handelskai 94-96
> A-1200 Vienna
>