[lucy-user] C library - Scoring mechanism

serkanmula...@gmail.com Mon, 20 Nov 2017 17:09:43 -0800

Hi guys,

I have a question regarding the scoring mechanism for relevancy. Is the scoring
mechanism tf/idf when the field is indexed with the EasyAnalyzer in the schema?
What happens when multiple terms are used? Are the tf/idf values summed? How
does it incorporate the location of the words into the scoring mechanism for
queries with multiple words?


How about fields that use a RegexTokenizer? Is it still the same mechanism?
Does the type of tokenizer affect the scoring? I believe the important thing
is the generated tokens (not the tokenizer that produced them), and maybe the
order of the tokens in a document.

One more thing: if I wanted to change the scoring mechanism for different
fields, how could I do it? Are there any predefined mechanisms, e.g. tf/idf,
doc2vec, etc.? Or, if I want to go further and come up with my own, how can I
do it?

Thanks,
Serkan
