Re: sorting by per doc hit count

2006-12-20 Thread Chris Hostetter
: problem remains that I would like to be able to switch between the hits per doc Similarity and the default Similarity on any given search. I was hoping that I could index with DefaultSimilarity and store the norms for normal relevancy searching. Then I would need to ignore or make

Re: sorting by per doc hit count

2006-12-20 Thread Mark Miller
Mr. Hostetter, you are a Godsend. Just wanted to report to anyone following this thread that Hoss's solution was perfect and I indeed was able to add a new dynamically changeable Term frequency relevance scoring system. The value of such a thing may not be high, but man do I love Lucene for

Re: sorting by per doc hit count

2006-12-20 Thread Chris Hostetter
: this thread that Hoss's solution was perfect and I indeed was able to add a new dynamically changeable Term frequency relevance scoring system. The

Cool ... PINE is my IDE. -Hoss

Re: sorting by per doc hit count

2006-12-19 Thread Mark Miller
Could I use another Similarity that returned 1 for most of the scoring factors and the actual term frequency for tf (rigging the equation)? Could I then alternate between DefaultSimilarity and HitsPerDocSimilarity per search? LIA (Lucene in Action) mentioned something about needing to rebuild the index if you change Similarities.
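To make the "rig the equation" idea concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the ScoringRules interface and the names DEFAULT_LIKE and HITS_PER_DOC are hypothetical, and the default-like formulas (square-root tf, logarithmic idf) are an assumption about how a default similarity typically behaves. Field norms are left out on purpose, since those are stored at index time and are what the rest of this thread is about.

```java
public class SimilaritySwitchSketch {

    /** Hypothetical stand-in for the search-time factors a Similarity supplies. */
    interface ScoringRules {
        double tf(int freq);
        double idf(int docFreq, int numDocs);
    }

    /** Assumed default-like behaviour: dampened tf, rarer terms weigh more. */
    static final ScoringRules DEFAULT_LIKE = new ScoringRules() {
        public double tf(int freq) {
            return Math.sqrt(freq);
        }
        public double idf(int docFreq, int numDocs) {
            return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
        }
    };

    /** Rigged rules: every factor is 1 except tf, which is the raw hit count. */
    static final ScoringRules HITS_PER_DOC = new ScoringRules() {
        public double tf(int freq) {
            return freq;
        }
        public double idf(int docFreq, int numDocs) {
            return 1.0;
        }
    };

    /** Single-term score under the chosen rules (norms deliberately omitted). */
    static double score(ScoringRules rules, int freq, int docFreq, int numDocs) {
        return rules.tf(freq) * rules.idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        int freq = 7, docFreq = 3, numDocs = 100;
        // The rules object is picked per call, i.e. per search.
        System.out.println("default-like: " + score(DEFAULT_LIKE, freq, docFreq, numDocs));
        System.out.println("hits per doc: " + score(HITS_PER_DOC, freq, docFreq, numDocs));
    }
}
```

With the rigged rules the score of a single-term match is simply the number of occurrences in the document, so ranking by score becomes ranking by hit count. Whether the same switch works against a real index depends on how the stored norms are handled.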

Re: sorting by per doc hit count

2006-12-19 Thread Doron Cohen
Mark Miller [EMAIL PROTECTED] wrote on 19/12/2006 09:21:00:
> LIA mentioned something about needing to rebuild the index if you change Similarities. That does not make sense to me yet. It would seem you could alternate them. What does scoring have to do with indexing?

For this part of your

Re: sorting by per doc hit count

2006-12-19 Thread Mark Miller
Thanks for the tip, Doron. What if I replace the static decodeNorm method in Similarity so that it always returns 1 for the HitsPerDocSimilarity? This would not require a re-index, right?

Doron Cohen wrote:
Mark Miller [EMAIL PROTECTED] wrote on 19/12/2006 09:21:00:
LIA mentioned something

Re: sorting by per doc hit count

2006-12-19 Thread Mark Miller
Foolish me...override a static method...silly silly. Still, I think there must be some way. I don't care about the field normalization...there must be some way to make it return a constant 1 when using a new Similarity class.

Doron Cohen wrote:
Mark Miller [EMAIL PROTECTED] wrote on

Re: sorting by per doc hit count

2006-12-19 Thread Chris Hostetter
: Foolish me...override a static method...silly silly. Still, I think there must be some way. I don't care about the field normalization...there must be some way to make it return a constant 1 when using a new Similarity class.

as discussed: norms are a value explicitly stored in your

Re: sorting by per doc hit count

2006-12-19 Thread Mark Miller
I appreciate your help, Hoss. That has cleared up some things for me. The problem remains that I would like to be able to switch between the hits per doc Similarity and the default Similarity on any given search. I was hoping that I could index with DefaultSimilarity and store the norms for

sorting by per doc hit count

2006-12-16 Thread Mark Miller
I have not really looked into this yet, but maybe you can save me some time: is it feasible/simple to sort by the number of hits found per document? Would this require changing the scoring system (removing idf, etc.) and doing a normal relevancy search? Could it be done with FunctionQuery? Any

Re: sorting by per doc hit count

2006-12-16 Thread Erick Erickson
Well, if you're not interested in doing much in the way of complex queries, you could use TermDocs/TermEnum (particularly look at TermDocs) to count the number of times a term appears in each document. I think you'll be surprised at how quickly you can get this info. Making your own scorer seems
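
As a rough sketch of that counting approach, the plain-Java example below sums per-document occurrence counts for a set of query terms and ranks documents by the total. It does not use Lucene: the in-memory postings map (term to doc id to frequency) is a hypothetical stand-in for what iterating TermDocs would yield for each term, and all class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HitCountSketch {

    /** Sums, per document, how often any of the query terms occurs. */
    static Map<Integer, Integer> countHits(Map<String, Map<Integer, Integer>> postings,
                                           List<String> queryTerms) {
        Map<Integer, Integer> totals = new HashMap<>();
        for (String term : queryTerms) {
            Map<Integer, Integer> docs =
                    postings.getOrDefault(term, Collections.<Integer, Integer>emptyMap());
            for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    /** Doc ids ordered by total hits, highest first; ties broken by doc id. */
    static List<Integer> rankByHits(Map<Integer, Integer> totals) {
        List<Integer> ids = new ArrayList<>(totals.keySet());
        ids.sort((a, b) -> {
            int byHits = Integer.compare(totals.get(b), totals.get(a));
            return byHits != 0 ? byHits : Integer.compare(a, b);
        });
        return ids;
    }

    /** Tiny made-up postings: term -> (doc id -> occurrences in that doc). */
    static Map<String, Map<Integer, Integer>> samplePostings() {
        Map<String, Map<Integer, Integer>> postings = new HashMap<>();
        Map<Integer, Integer> lucene = new HashMap<>();
        lucene.put(0, 3);
        lucene.put(1, 1);
        Map<Integer, Integer> index = new HashMap<>();
        index.put(1, 5);
        index.put(2, 2);
        postings.put("lucene", lucene);
        postings.put("index", index);
        return postings;
    }

    public static void main(String[] args) {
        List<String> query = new ArrayList<>();
        query.add("lucene");
        query.add("index");
        Map<Integer, Integer> totals = countHits(samplePostings(), query);
        System.out.println("hits per doc: " + totals);
        System.out.println("ranking: " + rankByHits(totals));
    }
}
```

This only covers the simple case of a flat list of terms, which matches the "not much in the way of complex queries" caveat; boolean or phrase logic would need more than a sum.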