18 apr 2006 kl. 11.45 skrev Danilo Cicognani:

Following is the code we are using now: we was considering the possiblity to have more informations from Lucene (for example the maximum term frequency
in one document) to optimized the calculations.
The first method is the one that start the calculation of Tf/Idf using the
class TTfIdf whose constructor is reported below.

                for(int i=0;i<l;i++){        // CAN BE OPTIMIZED IN SOME WAY?
                        if(freqs[i]>maxfreq) maxfreq=freqs[i];
                }
                this.freqs=new double[l];
                double tf;
                double idf;
                for(int i=0;i<l;i++){        // CAN BE OPTIMIZED IN SOME WAY?
                        tf=(double)freqs[i]/(double)maxfreq;
                        idf=Math.log((double)docs/(double)df[i]);
                        this.freqs[i]=tf*idf;
                }

Not quite sure what you do above, but I guess you could caclulate the information at index time. To persist it in the index, extend/hack TermFreqVector and related IO-classes.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to