Hi everyone,

If you were asked to explain how Lucene's ranking algorithm works to someone who is not technical and doesn't understand math, how would you go about it?
I'm going to list what I see as the key points, but please correct me where correction is needed and add anything I've missed. Here are the talking points I can think of.

Search terms are: "to be or not to be, that is the question"

The examples below are simple term searches (no booleans, no phrases, no fields, etc.).

1) Documents that contain all or most of the search terms are ranked highest.

   hit #1: ... ... to be or not to be, that is the question ... ...
   hit #2: ... ... to be, that is the question ... ...
   hit #3: ... ... is the question ... ...

2) Documents that contain all or most of the search terms more often than other documents are ranked higher.

   hit #1: ... ... to be or not to be, that is the question and is still the question ... ...
   hit #2: ... ... to be or not to be, that is the question ... ...
   hit #3: ... ... to be, that is the question ... ...

3) Documents that contain the search terms closer to each other are ranked higher.

   hit #1: ... ... to be or not to be, that is the question ... ...
   hit #2: ... ... to be or not to be, is what is being asked, that is the question ... ...
   hit #3: ... ... is the question ... ...

4) Among documents that contain exactly the search terms, including the number of times each term occurs, the shorter document is ranked higher.

   hit #1: to be or not to be, that is the question
   hit #2: ... ... to be or not to be, that is the question ... ...

5) Documents that contain more of the complex / longer terms are ranked higher than those containing more of the lighter terms.

   hit #1: ... ... to be or not to be, that is the question and is still the question to question ... ...
   hit #2: ... ... to be or not to be and to be or not to be, and to be or not to be, that is the question ... ...

6) Documents that contain the search terms in the same order as the query are ranked higher.

   hit #1: ... ... to be or not to be, that is the question ... ...
   hit #2: ... ... question the that is be not to be or be ... ...
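For anyone who wants to try these points out, here is a minimal Python sketch of a TF-IDF style score for a simple term search. It is only loosely modeled on the classic TF-IDF way of scoring and is not Lucene's actual code. The exact formula Lucene applies depends on the version and on the similarity that is configured. The names `tokenize` and `score` and the specific weightings are assumptions made for illustration. The sketch has a factor for how many of the search terms match (point 1), how often they occur (point 2), and how long the document is (point 4). It weights a term by how rare it is across the documents, which is an assumption about what point 5 is getting at. It has no notion of proximity or word order, so it says nothing about points 3 and 6.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter.
    # A stand-in for a real analyzer; no stemming, no stop words.
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def score(query, doc, corpus):
    """Toy TF-IDF style score of `doc` for `query` (illustrative only)."""
    q_terms = sorted(set(tokenize(query)))
    d_tokens = tokenize(doc)
    counts = Counter(d_tokens)
    matched = [t for t in q_terms if counts[t] > 0]
    if not matched:
        return 0.0

    # Point 1: fraction of the distinct search terms found in the document.
    coord = len(matched) / len(q_terms)
    # Point 4: shorter documents get a larger weight.
    length_norm = 1.0 / math.sqrt(len(d_tokens))

    total = 0.0
    for term in matched:
        # Point 2: more occurrences help, with diminishing returns.
        tf = math.sqrt(counts[term])
        # Assumed reading of point 5: terms found in fewer documents count more.
        doc_freq = sum(1 for d in corpus if term in tokenize(d))
        idf = 1.0 + math.log(len(corpus) / (doc_freq + 1))
        total += tf * idf ** 2 * length_norm
    return coord * total


if __name__ == "__main__":
    query = "to be or not to be, that is the question"
    corpus = [
        "to be or not to be, that is the question",
        "to be, that is the question",
        "is the question",
    ]
    ranked = sorted(corpus, key=lambda d: score(query, d, corpus), reverse=True)
    for hit, doc in enumerate(ranked, start=1):
        print(f"hit #{hit}: {doc}  (score {score(query, doc, corpus):.3f})")
```

With the three hits from point 1 as the corpus, this sketch ranks them in the order listed there. The factors interact, so a longer document that repeats some terms does not necessarily beat a shorter one, and a document with the same words in scrambled order gets the same score as the original.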
I think I have all of the above right (though I'm not sure about #6).

Thanks,
Steven