Hi every body
I know it may seem stupid, but I'm in the middle of a research and I need a
piece of code in lucene to give me a weight matrix of a text collection and a
given query:
W i,j = (f i,j)x(idf i)
AND for the query:
W i,q = (0.5 + (0.5xfreq i,q)/Max(freq i,q))x (idf i )
where:
f i,j = Normilize frequency = freq i,j / Max(freq j)
freq i,j = frequency of (k i) in document j (d j)
idf i= log(N/(n i)) idf= Inverse Document Frequency
N = total number of documents in the collection
n i = number of documents which has the TERM i (k i)
could any body help?
Many thanks in advance
best wishes to all