On Sun, 22 Feb 2009, Andi Vajda wrote:
On Sun, 22 Feb 2009, Bill Janssen wrote:
I'm probably missing something incredibly obvious here...
I'm trying to call MoreLikeThis.setStopWords(Set words). I've got a
list of stop words in Python, but I can't figure out how to turn that
into a Java Set. I tried "lucene.HashSet(set(words))",
"lucene.HashSet(lucene.ArrayList(JArray("string")(words)))", and so
forth, without much luck.
PyLucene doesn't wrap the java.util.Arrays class, which bridges the gap in
Java between arrays and collections. That should be considered an oversight
of mine. I should add it to the JCC invocation in PyLucene's Makefile. Then
you would be able to pass your JArray instance to the Arrays.asList() method
to make a List and finally feed that to a HashSet.
Another alternative is to implement a Python extension of the Java Set
interface. Guess what? That is already part of PyLucene. The PythonSet class
is the extension point for implementing a Java Set in Python, and it is part
of the PyLucene distribution.
I even have such a Python implementation of a Java Set, called JavaSet.py,
ready here but it's not currently shipping with PyLucene, another oversight
of mine. I should add it to the distribution.
Until then, here it is below. It takes a Python set instance as its constructor
argument and implements the complete Java Set interface. This example also
illustrates a Python implementation of the Java Iterator interface.
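A stand-alone sketch of the idea follows. It is not the actual JavaSet.py:
the class names SetSketch and SetIterator are hypothetical, the classes do not
subclass PythonSet (so they run without a JVM but cannot be passed to Java as
written), collection arguments are assumed to be plain Python iterables, and
equals() and hashCode() are left out.

```python
# Stand-alone sketch: no JVM and no PyLucene import is needed to run it.
# A real implementation would subclass PyLucene's PythonSet extension point.

class SetIterator(object):
    """Java-style Iterator (hasNext/next/remove) over a Python set."""

    def __init__(self, _set):
        self._set = _set
        self._items = list(_set)   # snapshot, so remove() cannot disturb iteration
        self._pos = 0

    def hasNext(self):
        return self._pos < len(self._items)

    def next(self):
        if not self.hasNext():
            raise StopIteration    # Java would throw NoSuchElementException
        item = self._items[self._pos]
        self._pos += 1
        return item

    def remove(self):
        # Java semantics: remove the element most recently returned by next()
        if self._pos == 0:
            raise ValueError("next() has not been called yet")
        self._set.discard(self._items[self._pos - 1])


class SetSketch(object):
    """Java Set method names implemented on top of a Python set."""

    def __init__(self, _set):
        self._set = _set

    def add(self, obj):
        if obj in self._set:
            return False
        self._set.add(obj)
        return True

    def addAll(self, collection):
        size = len(self._set)
        self._set.update(collection)
        return len(self._set) > size      # True if the set changed

    def clear(self):
        self._set.clear()

    def contains(self, obj):
        return obj in self._set

    def containsAll(self, collection):
        return all(obj in self._set for obj in collection)

    def isEmpty(self):
        return len(self._set) == 0

    def iterator(self):
        return SetIterator(self._set)

    def remove(self, obj):
        if obj in self._set:
            self._set.remove(obj)
            return True
        return False

    def removeAll(self, collection):
        size = len(self._set)
        self._set.difference_update(collection)
        return len(self._set) < size      # True if the set changed

    def retainAll(self, collection):
        size = len(self._set)
        self._set.intersection_update(collection)
        return len(self._set) < size      # True if the set changed

    def size(self):
        return len(self._set)

    def toArray(self):
        return list(self._set)
```

Each mutating method returns a boolean saying whether the set changed, which
is what the Java Set contract expects from add(), remove() and friends.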
I added a collections.py module to the PyLucene distribution.
To use it:
>>> from lucene.collections import JavaSet
>>> from lucene import initVM, CLASSPATH
>>> initVM(CLASSPATH)
>>> a = JavaSet(set(['foo', 'bar', 'baz']))
I also added some missing proxies for the mapping and sequence protocols so
that JavaSet can be iterated and used with the 'in' operator from Python.
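As a rough pure-Python analogy of what such proxies do (the real ones live in
the generated wrapper code, not in Python), the sketch below maps Python's
'in', len() and iteration onto Java-style methods. JavaStyleSet is a
hypothetical stand-in, not a PyLucene class.

```python
class JavaStyleSet(object):
    """Hypothetical stand-in for a wrapped Java Set; not a PyLucene class."""

    def __init__(self, items):
        self._items = set(items)

    # Java-style methods, as a wrapped java.util.Set would expose them
    def contains(self, obj):
        return obj in self._items

    def size(self):
        return len(self._items)

    def toArray(self):
        return list(self._items)

    # Python protocol methods that delegate to the Java-style ones
    def __contains__(self, obj):      # enables: obj in a
        return self.contains(obj)

    def __len__(self):                # enables: len(a)
        return self.size()

    def __iter__(self):               # enables: for x in a
        return iter(self.toArray())


a = JavaStyleSet(['foo', 'bar', 'baz'])
found = 'foo' in a                    # goes through __contains__
words = sorted(a)                     # goes through __iter__
```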
Andi..