|
Hi,
I was wondering whether anyone has tried implementing anything in Lucene that uses the index to calculate similarity between the content of pages, i.e. a score for how similar page B is to page A, rather than a score for page B against a query term?
I suppose the IndexReader class would be useful (docFreq(), termDocs(), terms(), etc.). I just wanted to get an impression of whether anyone has actually tried this and how doable it is. As a new developer on the Lucene project, some of it is still a bit of a mystery to me, so any pointers would be great.
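To make the question more concrete, here is a rough sketch of the kind of score I have in mind. It is plain Java and does not touch Lucene at all: the class name PageSimilarity, its methods, and the choice of cosine similarity over tf-idf weights are all my own assumptions, not anything Lucene provides. The document frequencies and document count are passed in as ordinary values; in a real version I imagine they would come from IndexReader.docFreq() and IndexReader.numDocs().

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: scores page A against page B.
final class PageSimilarity {

    // Count how often each term occurs in one page (raw term frequency).
    static Map<String, Integer> termFrequencies(List<String> terms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : terms) {
            tf.merge(term, 1, Integer::sum);
        }
        return tf;
    }

    // Weight each term by tf * idf. The idf formula (ln(numDocs / df) + 1)
    // is an assumption for illustration; terms missing from docFreq are
    // treated as occurring in one document.
    static Map<String, Double> tfIdf(Map<String, Integer> tf,
                                     Map<String, Integer> docFreq,
                                     int numDocs) {
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int df = Math.max(1, docFreq.getOrDefault(e.getKey(), 1));
            double idf = Math.log((double) numDocs / df) + 1.0;
            weights.put(e.getKey(), e.getValue() * idf);
        }
        return weights;
    }

    // Cosine of the angle between two sparse term-weight vectors.
    // Returns 0.0 if either vector is empty or all zeros.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

So for two pages I would build the term counts for each, turn them into weights with tfIdf(), and call cosine() on the two results. A page compared with itself should score 1, and two pages with no terms in common should score 0. What I don't know is how best to pull the per-page terms and frequencies out of the index efficiently.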
Thanks,
Maurice |