> Here are the results of some tests using David's "more like.." class.
Thanks for testing.
> http://home.clara.net/markharwood/lucene/mlt.htm
Looks useful.
> I have a couple of suggestions in the review.
Your text copied here and my comments:
> Overall, a pretty useful class. I'm not sure if the "stop words" stuff is useful in the class - most systems
> have these defined in the analyzer.
Good point. I was following the original mail suggesting this algorithm and didn't think it through.
Should be able to set the table to null to disable this in my class.
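
To make the "set the table to null" idea concrete, here is a minimal sketch of an optional stop-word check. The class name StopWordFilterSketch and its methods are hypothetical and are not taken from the class under review; the only assumption is that a null table means "skip the check and trust the analyzer".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not the class discussed in this thread.
class StopWordFilterSketch {
    // null means "no stop-word filtering here"; the analyzer is assumed to handle it.
    private Set<String> stopWords;

    StopWordFilterSketch(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    void setStopWords(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    // Returns the terms that survive the optional stop-word check, in input order.
    List<String> filter(List<String> terms) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            if (stopWords == null || !stopWords.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }
}
```

With this shape, callers whose analyzer already drops stop words would just pass null and pay no extra cost.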
> Also I think the "defaults" for minTermFreq should be dynamic and adjust subject to the example text
> length. If it's short there is likely to be little in the way of word repetition.
Good point, adaptive "AI" would make sense here.
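
One possible way to make the default adapt to the example text length is sketched below. This is only a guess at a heuristic: the class name, the method minTermFreqFor, and both constants are hypothetical and untested against real data. The idea is that short texts get a threshold of 1 and longer texts get a higher one, up to a cap.

```java
// Hypothetical heuristic for illustration only; the constants are assumed, not tuned.
class AdaptiveMinTermFreqSketch {
    // Assumed step size: raise the threshold by one for every 500 tokens of example text.
    static final int TOKENS_PER_STEP = 500;
    // Assumed cap so that very long texts do not demand unrealistic repetition.
    static final int MAX_MIN_TERM_FREQ = 5;

    // Returns a minimum term frequency that grows with the example text length.
    static int minTermFreqFor(int tokenCount) {
        if (tokenCount < 0) {
            throw new IllegalArgumentException("tokenCount must be non-negative");
        }
        int scaled = 1 + tokenCount / TOKENS_PER_STEP;
        return Math.min(scaled, MAX_MIN_TERM_FREQ);
    }
}
```

An explicitly set value should probably still override whatever a heuristic like this computes.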
> I would also suggest making the parameters have getters/setters rather than exposed public fields.
Yeah, for consistency this is probably true. I'm tired of typing in obvious 1 line getters/setters
that have no real point (though I have been doing this for years), so I left 'em out.
thx, Dave
Cheers Mark
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
