Hi,

Maybe I have misunderstood the general concept of how search results should be scored with regard to the fieldNorm, but as I see it, it has an irritating effect on the sort order in my case.
Here's the deal: I'm building a simple site with documents that represent ideas. Each idea can be active or inactive. Our search page has a simple text field for the search text. Other than that, the only thing the user can influence is whether to search all ideas or only active ones. The problem is that if a search over all ideas happens to return only active ideas, the sort order can still change when the user repeats the same search restricted to active ideas.

Example: A search for "betyg", where the user doesn't care whether the ideas are active or inactive, gives this result:

  document-153
  document-244

The user then checks the checkbox "Only active ideas" and clicks the search button again. Now the result is:

  document-244
  document-153

When I turned on debug mode for the Lucene part of the 3rd party CMS, I saw the queries that Lucene got.

The first query:

  +type:idea +alltext:betyg

The second query:

  +type:idea +(+alltext:betyg +category:14)

(Category 14 represents the status Active.)

I started Luke and ran the same searches there, with the same outcome: the sort order of the first search was the reverse of the sort order of the second search. I then clicked the "Explain" button for each document. There I found that all nodes had the same value for both documents, except for the last one, the fieldNorm for the field category.

I then did a quick Google search for this fieldNorm, and found this:

  http://www.mail-archive.com/[EMAIL PROTECTED]/msg06275.html

So the fieldNorm is the product of the field boost for the document and the lengthNorm for the field in the document. I am pretty sure that the boost is the same for both documents, so that leaves only the lengthNorm. And according to the javadoc for the Similarity class, the lengthNorm value depends on the number of tokens in the field for the particular document.
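To make sure I read that correctly, here is a rough sketch of the calculation as I understand it. It does not call Lucene at all; it assumes the default implementation computes lengthNorm as 1/sqrt(numTokens), which is how I read the DefaultSimilarity javadoc, and the class and method names are just made up for illustration. The token counts 5 and 6 and the boost of 1.0 are example inputs.

```java
// Sketch only: mirrors the 1/sqrt(numTokens) formula that, as far as I can
// tell, the default Similarity uses. It does not call Lucene itself.
class LengthNormSketch {

    // Assumed default length normalization: fewer tokens give a larger norm.
    static float lengthNorm(int numTokens) {
        return (float) (1.0 / Math.sqrt(numTokens));
    }

    // fieldNorm as described above: field boost times lengthNorm.
    static float fieldNorm(float boost, int numTokens) {
        return boost * lengthNorm(numTokens);
    }

    public static void main(String[] args) {
        float boost = 1.0f; // assumed to be identical for both documents
        System.out.println("5 tokens: " + fieldNorm(boost, 5));
        System.out.println("6 tokens: " + fieldNorm(boost, 6));
    }
}
```

If that assumption holds, then with equal boosts the field with fewer tokens always gets the larger norm.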
And now the strange behavior makes sense, because document 153 has a total of 6 different tokens in the category field, and document 244 has only 5. Document 244 therefore gets the higher fieldNorm for category, which is enough to move it ahead of document 153 once the category clause is part of the query. But in this case, this behavior is not really what I want.

Do you have any suggestions on how to solve this? Is it possible to disable the lengthNorm calculation for particular fields?

Regards
/Jimi

mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30 stockholm sweden | +46 8 506 66 172 | +46 765 27 19 55 | [EMAIL PROTECTED] | www.mogul.com