Andi Vajda <va...@apache.org> wrote:

> On Sun, 22 Feb 2009, Bill Janssen wrote:
>
> > OK, I added JavaSet to my codebase.
>
> You can now use the one in the PyLucene's collections module:
>
>   >>> from lucene.collections import JavaSet
>
> > But still no joy -- I can now call
> >
> >   mlt = MoreLikeThis(...)
> >   mlt.setStopWords(JavaSet(set(["foo", "bar", "bletch"])))
> >   terms = mlt.retrieveInterestingTerms(...)
> >
> > Unfortunately, I still get "foo", "bar", and "bletch" in the terms.
> >
> > So something is still off.  Looking at the code for use of stopwords in
> > MLT, it's pretty simple and looks correct.
>
> Well, that might be a different issue altogether. I believe PyLucene
> comes with a sample illustrating MoreLikeThis functionality (in the
> Lucene in Action sample set). Maybe there are hints there as to what
> might be wrong.
>
> I verified that a JavaSet instance is a valid java.util.Set:
>
>   >>> from lucene import *
>   >>> from lucene.collections import JavaSet
>   >>> initVM(CLASSPATH)
>   >>> a = JavaSet(set(['foo', 'bar', 'baz']))
>   >>> b = HashSet(a)
>   >>> a
>   <JavaSet: org.apache.pylucene.util.python...@cd4544>
>   >>> b
>   <HashSet: [foo, baz, bar]>
>   >>> list(a)
>   ['baz', 'foo', 'bar']

The only thing I can think of is that the String instance passed in
from Java isn't being handled properly by the "contains" method of
JavaSet.  I'll see if I can verify that.

Bill
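
P.S. To make that hypothesis concrete, here is a minimal pure-Python
sketch of the failure mode I have in mind. It does not use PyLucene at
all: WrappedString, NaiveSet and ConvertingSet are hypothetical
stand-ins invented for illustration, not the real JavaSet code, and I
am only assuming that the argument might reach "contains" as a wrapper
object rather than as a native Python string.

```python
class WrappedString:
    """Hypothetical stand-in for a Java String that reaches Python unconverted."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value


class NaiveSet:
    """Hypothetical set wrapper whose contains() tests the object as-is."""

    def __init__(self, items):
        self._set = set(items)

    def contains(self, obj):
        # A wrapper object neither hashes nor compares like a Python str,
        # so this lookup misses even when the text matches.
        return obj in self._set


class ConvertingSet(NaiveSet):
    """Same wrapper, but it converts the argument to str before the lookup."""

    def contains(self, obj):
        return str(obj) in self._set


stop_words = {"foo", "bar", "bletch"}
probe = WrappedString("foo")

naive_hit = NaiveSet(stop_words).contains(probe)
converted_hit = ConvertingSet(stop_words).contains(probe)
print(naive_hit, converted_hit)
```

If the real "contains" behaves like NaiveSet here, every stop word
lookup would miss, which would match what I am seeing. Whether that is
what actually happens is still to be checked.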