On Wed, 23 Apr 2014 17:08:21 -0700, John Zhong <[email protected]> wrote:

> Hi,
>
> I have a few questions about the cts:relevance-info result; please see below.
> When searching, I use the default scoring method. According to the
> documentation, its formula is log(term frequency) * (inverse document
> frequency). My questions are:
>
> 1. How is logtf = 13 calculated? My document has only two hits
> for the word "Test", so I think tf = term frequency = 2 (?), and then
> log(2) = 0.3? Please note that the two hits are in two different elements,
> which have different weights in the database > Word Query settings,
> because I want some elements to score higher.

We use integer arithmetic for all scoring calculations, and tables for
much of it, so the logtf values are scaled, capped at minimum and
maximum values, and bucketed. This is why you see otherwise mysterious
constants like 8 and 256 in the score calculations. The minimum (non-zero)
logtf value is 8.
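
To make "scaled, capped, and bucketed" concrete, here is a minimal sketch in
Python. It is not the actual MarkLogic calculation: only the minimum of 8 comes
from the explanation above. The scale factor, the bucket width, the use of 256
as the cap, and the name scaled_logtf are all assumptions chosen for
illustration.

```python
import math

MIN_LOGTF = 8    # stated above: the minimum non-zero logtf value
MAX_LOGTF = 256  # ASSUMPTION: illustrative cap, not a documented value
SCALE = 8        # ASSUMPTION: illustrative scale factor
BUCKET = 4       # ASSUMPTION: illustrative bucket width


def scaled_logtf(tf: int) -> int:
    """Hypothetical integer logtf: scale, bucket, then clamp to a range."""
    if tf <= 0:
        return 0  # no hits, no contribution
    raw = int(SCALE * math.log2(tf))      # scale the log and truncate to an integer
    bucketed = (raw // BUCKET) * BUCKET   # round down to the bucket boundary
    return max(MIN_LOGTF, min(MAX_LOGTF, MIN_LOGTF + bucketed))


# Precomputed lookup table for small term frequencies (index = tf).
LOGTF_TABLE = [scaled_logtf(tf) for tf in range(65)]
```

With these assumed constants a single hit maps to the minimum of 8, and very
large term frequencies all collapse to the cap, so the result never resembles a
textbook value such as log(2) = 0.3.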

> 2. What is the maximum weight we can set? Some documentation I read says it is
> 16 but also mentions "super weight +64". I tried setting a higher number and
> found I can set it up to 67.

We won't stop you from passing other values. If you pass something greater
than 64, we'll scale it down into the range. We also won't stop you from using
values between 16 and 64, but the intended usage is to keep a clear separation
between normal weights in the +/-16 range and superboosting, which is well
above that (64).
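
As a rough illustration of that convention, the sketch below sorts a weight
into the ranges just described. The function name describe_weight and its
labels are hypothetical, and it deliberately does not guess how values above 64
are scaled down or how weights below -16 are treated, since neither is
specified here.

```python
NORMAL_MAX = 16.0   # normal weights stay within +/-16
SUPER_BOOST = 64.0  # the superboost value


def describe_weight(weight: float) -> str:
    """Hypothetical helper: label a weight by the intended-usage ranges."""
    if -NORMAL_MAX <= weight <= NORMAL_MAX:
        return "normal"
    if weight < -NORMAL_MAX:
        return "unspecified"   # not covered by the explanation above
    if weight < SUPER_BOOST:
        return "discouraged"   # accepted, but blurs normal vs. superboost
    if weight == SUPER_BOOST:
        return "superboost"
    return "scaled down"       # accepted, then scaled down into the range
```

For example, describe_weight(67) returns "scaled down", which matches the
observation that a value such as 67 is accepted rather than rejected.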

//Mary
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
