On 24 Feb 2006, at 15:40, Nick Arnett wrote:
> I'm about to start some work exploring term frequencies and other
> index features... Wondering if anyone on this list has done much
> along those lines, who might have some suggestions about what works
> or doesn't.
I've done a bit with PyLucene and term frequencies, to implement the
query expansion described at <http://hublog.hubmed.org/archives/001304.html>.
The code was something like:
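    # max is the number of hits to examine
    for i in range(0, max):
        j = hits.id(i)  # document number of the i-th hit
        tfv = reader.getTermFreqVector(j, 'fieldname')
        if tfv:  # None if no term vector was stored for this field
            terms = tfv.getTerms()
            freqs = tfv.getTermFrequencies()
            for k in range(0, len(terms)):
                term = terms[k]
                freq = freqs[k]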
and then for each unique term:
    t = Term('fieldname', term)
    df = reader.docFreq(t)
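
To show how the per-document frequencies and the document frequencies might be combined, here is a rough, self-contained sketch that runs without an index. Everything in it is an assumption for illustration: the function name rank_expansion_terms, the toy data, and the tf * log(N / df) weighting, which is one common choice and not necessarily what the linked post uses. The (terms, freqs) pairs stand in for what getTerms() and getTermFrequencies() return, and doc_freq stands in for reader.docFreq().

```python
import math
from collections import Counter


def rank_expansion_terms(term_vectors, doc_freq, num_docs, top_n=5):
    """Rank candidate expansion terms by summed tf * log(num_docs / df).

    term_vectors: one (terms, freqs) pair per hit (hypothetical stand-in
        for the arrays read from each hit's term frequency vector).
    doc_freq: callable mapping a term to the number of documents that
        contain it (hypothetical stand-in for reader.docFreq).
    num_docs: total number of documents in the index.
    """
    # Sum each term's frequency over all of the hits.
    totals = Counter()
    for terms, freqs in term_vectors:
        for term, freq in zip(terms, freqs):
            totals[term] += freq

    # Weight the summed frequency by how rare the term is in the index.
    scores = {}
    for term, tf in totals.items():
        df = doc_freq(term)
        if df == 0:
            continue  # term unknown to the index; nothing to score
        scores[term] = tf * math.log(num_docs / df)

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Toy data in place of a real index (made up for illustration).
vectors = [
    (["gene", "protein", "the"], [3, 2, 10]),
    (["gene", "the"], [1, 8]),
]
df_table = {"gene": 10, "protein": 5, "the": 1000}
ranked = rank_expansion_terms(
    vectors, lambda t: df_table.get(t, 0), num_docs=1000)
```

With this weighting, a term that appears in every document gets a score of zero however often it occurs in the hits, which is the reason for looking up df at all.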
alf.
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev