Steve Rowe and I have added scoring.xml (with some contributions from Karl Wettin, Chris Hostetter and others) to the xdocs directory (and scoring.html to the docs directory). Our goals in writing this document were:

1. To better understand scoring
2. To document how scoring works for the Lucene community, and how to make changes to scoring for a specific application
3. To kick-start more documentation on scoring

I think we have achieved #1, though that doesn't really benefit many others yet. As for #2, that remains to be seen. #3 is up to us to do.

To the end of achieving #2, I would appreciate it if other developers could take a look at http://lucene.apache.org/java/docs/scoring.html
and provide us feedback on any and all parts of this document.

Note that the above link is not yet hooked into the main menu system on the left-hand side of the Lucene site. In a week or two, once we have some feedback and updates, I plan to hook it into the projects.xml menu under the menu title "Scoring".

Specifically, we are interested in:

1. Errata, clarifications, improvements, additions of things that are useful. Where did we get the algorithms/descriptions wrong? Where could it be made clearer? Some of the areas of particular interest are those highlighted in yellow. Additionally:
        a. Filling in the "Big Picture" section with lower-level details on BooleanScorer2. Is this necessary?
        b. Other examples of changing Similarity
        c. Examples of adding your own Query. It would be great to have a write-up on the motivation behind SpanQuery or some of the other Query classes (other than TermQuery). It would also be great to have more on the semantics of what goes into implementing the various methods on Weight and Scorer.
        d. Should there be more of a discussion about how Hits/Searchers/Filters work? I purposely left these out b/c I wanted to focus on scoring, but these pieces do play a role in enabling scoring.
2. Organizational suggestions -- i.e., how could this document be better organized?
3. Grammar, spelling
4. If anyone knows how to get the Greek Sigma character to pass through in Anakia (Velocity), the section on the scoring formula would be most appreciative. The usual hex entity references don't seem to pass through correctly. I suspect there needs to be a change in site.vsl, but I don't know how to make it. (There is also an entity reference in systemproperties.xml that is not working correctly.)

As for goal #3, please feel free to add more insight into the scoring process, particularly if you can add value on the "why" question (i.e. why is scoring done this way.) This document is most likely just a start on documenting how scoring works.

As for changes, the best way is to submit a patch in JIRA (or just commit the changes, if you can). If not JIRA, then at least reply to this message.

-Grant

--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org

Voice: 315-443-5484
Skype: grant_ingersoll
Fax: 315-443-6886



