[ 
https://issues.apache.org/jira/browse/LUCENE-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061332#comment-13061332
 ] 

Robert Muir commented on LUCENE-2392:
-------------------------------------

I think we need to commit the refactoring portions (separating TF-IDF out) to 
trunk very soon, because its really difficult
to keep this branch in sync with trunk, e.g. lots of activity and refactoring 
going on.

I'd like to get this merged in as quickly as possible. I don't think the svn 
history is interesting, especially given
all the frustrations I am having with merging... The easiest way will be to 
commit a patch, I'll get everything in shape
and upload one soon, like, today.


> Enable flexible scoring
> -----------------------
>
>                 Key: LUCENE-2392
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2392
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: flexscoring branch
>
>         Attachments: LUCENE-2392.patch, LUCENE-2392.patch, LUCENE-2392.patch, 
> LUCENE-2392_take2.patch
>
>
> This is a first step (nowhere near committable!), implementing the
> design iterated to in the recent "Baby steps towards making Lucene's
> scoring more flexible" java-dev thread.
> The idea is (if you turn it on for your Field; it's off by default) to
> store full stats in the index, into a new _X.sts file, per doc (X
> field) in the index.
> And then have FieldSimilarityProvider impls that compute doc's boost
> bytes (norms) from these stats.
> The patch is able to index the stats, merge them when segments are
> merged, and provides an iterator-only API.  It also has starting point
> for per-field Sims that use the stats iterator API to compute boost
> bytes.  But it's not at all tied into actual searching!  There's still
> tons left to do, eg, how does one configure via Field/FieldType which
> stats one wants indexed.
> All tests pass, and I added one new TestStats unit test.
> The stats I record now are:
>   - field's boost
>   - field's unique term count (a b c a a b --> 3)
>   - field's total term count (a b c a a b --> 6)
>   - total term count per-term (sum of total term count for all docs
>     that have this term)
> Still need at least the total term count for each field.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to