On Fri, 14 Dec 2007, Helmut Jarausch wrote:
Following a suggestion on the Java-Lucene mailing list,
I tried to call Query.extractTerms.
In Java it has the signature
void extractTerms(Set terms)
It puts all results into the Java Set terms.
I tried
parser= QueryParser(...)
query = parser.parse('a string')
Set_of_Terms= set()
query.extractTerms(Set_of_Terms)
but this fails. The same happens with
Set_of_Terms=[]
query.extractTerms(Set_of_Terms)
So, what Python type should be used?
JCC does not map a Python set to a Java Set. When it sees a set object, it
does not convert it to an instance of a concrete Java Set implementation.
This could be done, though, by implementing a Java Set that wraps a Python
set instance, that is, a Python extension of the Java Set interface backed
by a Python set.
It's not too hard to do. I might just add that to JCC at some point.
In the meantime, use a concrete Java Set implementation such as HashSet
instead of a Python set:
>>> from lucene import QueryParser, StandardAnalyzer, HashSet, Term
>>> q = QueryParser("fields", StandardAnalyzer()).parse("foo AND bar")
>>> terms = HashSet()
>>> q.extractTerms(terms)
>>> terms
<HashSet: [fields:foo, fields:bar]>
>>> for term in terms:
...     print term, type(term)
...
fields:foo <type 'Object'>
fields:bar <type 'Object'>
Notice how Python sees the type of each term instance as Object. This is
because a HashSet is declared to contain Java Object instances, so the code
JCC generated wraps the HashSet contents as Object and not as what they
actually are (which it cannot know at compile time).
To cast the term objects to Term, use the cast_() method, as in:
list(terms)[0].cast_(Term)
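If what you want in the end is a Python set, which is what the original
attempt was after, the cast can be folded into a small helper. This is only
a sketch: extract_term_pairs is a name made up for illustration and is not
part of PyLucene. It assumes the cast_() call shown above and the field()
and text() accessors that Term has in the Java API.

```python
def extract_term_pairs(query, java_set, term_class):
    """Return a Python set of (field, text) tuples for the terms in query.

    java_set is assumed to be an empty Java Set instance such as HashSet(),
    and term_class the wrapped Term class. Both are passed in so that this
    sketch does not depend on how the lucene module is imported.
    """
    # extractTerms() fills java_set in place; it does not return the terms.
    query.extractTerms(java_set)
    pairs = set()
    for obj in java_set:
        # Each element comes back wrapped as Object, so cast it first.
        term = obj.cast_(term_class)
        pairs.add((term.field(), term.text()))
    return pairs
```

With the session above, it would be called as
extract_term_pairs(q, HashSet(), Term).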
To find out the actual Java class of a wrapped object, call getClass()
on it:
>>> list(terms)[0].getClass()
<Class: class org.apache.lucene.index.Term>
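To check a whole collection at once, the same call can be applied to every
element. Again a rough sketch: actual_class_names is a hypothetical helper,
and it assumes the wrapped java.lang.Class exposes getName() as it does in
Java.

```python
def actual_class_names(wrapped_objects):
    """Return the sorted, distinct Java class names of the given objects."""
    names = set()
    for obj in wrapped_objects:
        # getClass() reports the runtime class, not the declared Object type.
        names.add(str(obj.getClass().getName()))
    return sorted(names)
```

For the terms set above, actual_class_names(terms) should list only
org.apache.lucene.index.Term.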
Andi..