On 24 Feb 2006, at 15:40, Nick Arnett wrote:

I'm about to start some work exploring term frequencies and other index features... Wondering if anyone on this list has done much along those lines, who might have some suggestions about what works or doesn't.

I've done a bit with PyLucene and term frequencies, to build the query expansion described at <http://hublog.hubmed.org/archives/001304.html>.

The code was something like:

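# hits: the search results; reader: an open IndexReader;
# max_hits: how many of the top hits to scan
for i in range(max_hits):
    doc_id = hits.id(i)
    # returns nothing unless the field was indexed with term vectors
    tfv = reader.getTermFreqVector(doc_id, 'fieldname')
    if tfv:
        terms = tfv.getTerms()
        freqs = tfv.getTermFrequencies()
        for k in range(len(terms)):
            term = terms[k]
            freq = freqs[k]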

and then for each unique term:

t = Term('fieldname', term)
df = reader.docFreq(t)
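
One way to combine the two numbers, as a rough sketch only: sum each term's frequency over the top hits and weight it by an inverse document frequency, then keep the highest-scoring terms as expansion candidates. The function name, the plain-Python inputs and the tf * log(N/df) weighting are all assumptions for illustration, not necessarily what the code linked above does. The inputs stand in for the values collected from getTerms(), getTermFrequencies() and docFreq().

```python
import math
from collections import Counter


def rank_expansion_terms(doc_term_freqs, doc_freqs, num_docs, top_n=5):
    """Rank candidate expansion terms (hypothetical helper).

    doc_term_freqs: list of (terms, freqs) pairs, one pair per hit
    doc_freqs:      dict mapping term -> number of documents containing it
    num_docs:       total number of documents in the index
    """
    # Sum each term's frequency across all of the hits.
    totals = Counter()
    for terms, freqs in doc_term_freqs:
        for term, freq in zip(terms, freqs):
            totals[term] += freq

    # Assumed weighting: total frequency times log(N / df).
    scores = {}
    for term, tf in totals.items():
        df = doc_freqs.get(term, 0)
        if df == 0:
            continue  # no document frequency known, skip the term
        scores[term] = tf * math.log(num_docs / df)

    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


hits_data = [(["gene", "the"], [3, 10]), (["gene", "protein"], [2, 1])]
dfs = {"gene": 10, "the": 1000, "protein": 50}
print(rank_expansion_terms(hits_data, dfs, num_docs=1000, top_n=2))
```

With this weighting a term that occurs in every document scores zero, so very common words drop out of the candidates without a stop list.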

alf.
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
