Re: HighFreqTerms patch

2011-02-10 Thread Pablo Mendes
6:16 PM, Michael McCandless wrote:
> > Hmm, which version of Lucene are you using? Newer versions let you
> > specify a field...
> >
> > Mike
> >
> > On Wed, Feb 9, 2011 at 12:06 PM, Pablo Mendes wrote:
> >> Guys,
> >> this is tiny and pro

HighFreqTerms patch

2011-02-09 Thread Pablo Mendes
Guys, this is tiny and probably not relevant. But I'll bet a beer that at least a dozen people had to dirtymod this class when they could have run it from the command line. A 15 min time save that took 15 min to create. I guess it's a tie. Best, Pablo
--- HighFreqTerms.java
+++ ExtractStopwords.java

Re: Scaling Lucene to 1bln docs

2010-08-10 Thread Pablo Mendes
Shelly, Do you mind sharing with the list the final settings you used for your best results? Cheers, Pablo
On Tue, Aug 10, 2010 at 3:49 PM, anshum.gu...@naukri.com wrote:
> Hey Shelly,
> If you want to get more info on lucene, I'd recommend you get a copy of
> lucene in action 2nd Ed. It'll help

Modifying idf()?

2010-07-30 Thread Pablo Mendes
Hi all, I'd like to make a very simple change to the idf computation, but I can't seem to wrap my head around it. There are very useful hints in the javadocs for "Changing Similarity" for new tf() and lengthNorm() behavior, but it was a little bit blurrier for idf(): http://lucene.apache.org/java/3_0

IndexWriter.mergeDocument(Term term, Document doc)

2010-06-29 Thread Pablo Mendes
Hi all, I'm looking for functionality similar to IndexWriter.updateDocument(): IndexWriter.updateDocument(Term