lucene scoring

2008-08-07 Thread Александр Аристов
Hi people, what is the best way to implement scoring so that it becomes possible to compare scores obtained from different queries? A full problem description (clear and short) is here: http://thread.gmane.org/gmane.comp.jakarta.lucene.user/10760/focus=10810 I know about possible usage of

[jira] Commented: (LUCENE-1350) SnowballFilter resets the payload

2008-08-07 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620521#action_12620521 ] Doron Cohen commented on LUCENE-1350: {quote} I think that you should expand this

[jira] Updated: (LUCENE-1350) Filters which are consumers should not reset the payload or flags and should better reuse the token

2008-08-07 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen updated LUCENE-1350: Description: Passing tokens with payloads through SnowballFilter results in tokens with no

Extending TopDocCollector

2008-08-07 Thread Shai Erera
Hi, is it possible to change TopDocCollector's members to 'protected' instead of 'package' and 'private'? That would make it easy to extend. Today I need to extend it, but since I cannot access its members I have to implement getTotalHits() and topDocs() exactly as TopDocCollector does.

Re: lucene scoring

2008-08-07 Thread Grant Ingersoll
My understanding is that this is an area of research in Information Retrieval in general. There is some attempt at this with the query normalization factor in the scoring model, but my understanding is that one shouldn't rely on it. You might try searching Google Scholar (or MSN Academic Live, which I

[jira] Created: (LUCENE-1352) trailing escaped backslashes in quoted queries cause parse error

2008-08-07 Thread Michael Dodsworth (JIRA)
trailing escaped backslashes in quoted queries cause parse error Key: LUCENE-1352 URL: https://issues.apache.org/jira/browse/LUCENE-1352 Project: Lucene - Java Issue Type: Bug

[jira] Updated: (LUCENE-1352) trailing escaped backslashes in quoted queries cause parse error

2008-08-07 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated LUCENE-1352: Description: {noformat} The QueryParser fails to parse queries that contain escaped

Re[2]: lucene scoring

2008-08-07 Thread Александр Аристов
I want to implement searching with the ability to set a so-called confidence level, below which I would treat documents as garbage. I cannot define the level per query, as the level should be relevant for all documents. With the current scoring implementation the level would mean nothing. I don't believe

[jira] Commented: (LUCENE-1335) Correctly handle concurrent calls to addIndexes, optimize, commit

2008-08-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620723#action_12620723 ] Michael McCandless commented on LUCENE-1335: Thanks, Yonik. I'll add a

[jira] Updated: (LUCENE-1335) Correctly handle concurrent calls to addIndexes, optimize, commit

2008-08-07 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1335: Attachment: LUCENE-1335.patch Improved comments in expungeDeletes

Re: Re[2]: lucene scoring

2008-08-07 Thread Grant Ingersoll
On Aug 7, 2008, at 3:05 PM, Александр Аристов wrote: I want to implement searching with the ability to set a so-called confidence level, below which I would treat documents as garbage. I cannot define the level per query, as the level should be relevant for all documents. With current scoring

Re: lucene scoring

2008-08-07 Thread Andrzej Bialecki
Александр Аристов wrote: I want to implement searching with the ability to set a so-called confidence level, below which I would treat documents as garbage. I cannot define the level per query, as the level should be relevant for all documents. Hmm... I'm not sure I understand it properly - if the