Hi Thomas,
On Mon, 30 Apr 2012, Thomas Koch wrote:
I had another look at the patch I submitted recently and would like to get
back to it. An updated version of the patch is attached to this email -
the patch is against the branch_3x repo
http://svn.apache.org/repos/asf/lucene/pylucene/branches/branch_3x
Thank you for sending this, I just got back from vacation and was going to
ask you about this as I'd like to get the PyLucene 3.6 release out soon - if
possible with your patch.
The patch mainly
- adds two java classes: PythonList, PythonListIterator
- adds the corresponding Python classes (JavaListIterator and JavaList in
collections.py)
Purpose:
- provide a Java-based List implementation in JCC/PyLucene (similar to existing
PythonSet/JavaSet)
- allow passing Python lists via Java Collections into PyLucene
To summarize briefly: PythonSet/JavaSet already existed, but there was
nothing similar for lists. I implemented PythonList/JavaList, and with your
help this is now basically working, except for an open issue that affects
both JavaSet and JavaList: initializing an ArrayList with a JavaSet (or
JavaList) may cause trouble.
As you said: "There is a bug somewhere with constructing an ArrayList from
a python collection like JavaSet or JavaList."
I tried to change the toArray() method as you suggested, but that didn't
help. As far as I understood, there are two options to box python values
into a typed JArray:
1) use the object based JArray class and box python values by wrapping
them with the corresponding Java object (e.g. type<int> ->
lucene.Integer):
x = lucene.JArray('object')([lucene.Boolean(True),lucene.Boolean(False)])
JArray<object>[<Object: true>, <Object: false>]
type(x[0])
<type 'Object'>
2) use the correct array type (int, float, etc.) and pass the list of
Python elements (or literals) to the JArray constructor, e.g.
y = lucene.JArray('bool')([True,False])
JArray<bool>[True, False]
type(y[0])
<type 'bool'>
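Option 2 needs the array type name to be chosen before the JArray is built. A minimal sketch of that choice follows, assuming a homogeneous Python list. The helper name jarray_type_name is hypothetical and not part of PyLucene or the patch, and mapping Python floats to 'double' is an assumption. It only picks the name and does not touch the JVM.

```python
# Hypothetical helper, not part of PyLucene or the patch: choose a JArray
# type name for a Python list before handing the list to lucene.JArray.
_TYPE_NAMES = (
    (bool, 'bool'),
    (int, 'int'),
    (float, 'double'),  # assumption: Python floats are passed as Java doubles
    (str, 'string'),
)


def jarray_type_name(items):
    """Return a JArray type name for items, falling back to 'object'."""
    if not items:
        return 'object'
    for py_type, name in _TYPE_NAMES:
        # Exact type match, so that True/False are never treated as ints.
        if all(type(item) is py_type for item in items):
            return name
    # Mixed or unknown element types: the caller has to box each element
    # into a Java object itself (option 1).
    return 'object'
```

With the VM initialized, this would be used as lucene.JArray(jarray_type_name(values))(values). For the 'object' fallback the elements still need to be wrapped as in option 1.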
I tried both of them (see the _pyList2JArray methods in collections.py) but
neither did the trick. The 'empty objects in ArrayList' problem remains
when dealing with strings: an ArrayList initialized with a JavaSet or
JavaList of string items has the same number of objects as the original
JavaSet/JavaList, but all objects are the same (it looks like an array of
empty objects). Furthermore, another issue comes into play with integer
lists: here the initialization of the ArrayList with the Collection fails
with a Java stacktrace (lucene.JavaError:
org.apache.jcc.PythonException).
The simplest test case is as follows:
--%< --
import lucene
lucene.initVM()
from lucene.collections import JavaList
# using strings: the ArrayList is created, but initialized with empty objects
jl = JavaList(['a','b'])
al = lucene.ArrayList(jl)
assert (not al.get(0).equals(al.get(1))), "unique values"
# using ints: the ArrayList is not created, but an error occurs instead:
# Java stacktrace: org.apache.jcc.PythonException: ('while calling toArray')
jl = JavaList(range(3))
al = lucene.ArrayList(jl)
--%< --
I currently feel like I am stabbing around in the dark to find out what's
going on here and would welcome any suggestions. This needs a JCC expert, I
guess ;-)
Of course we can leave the patch out - but still there's the same issue
with JavaSet.
I'd like to get to the bottom of this before the 3.6 release. It's a matter
of my finding the time this week.
Thanks for reviving this !
Andi..
kind regards
Thomas
--
OrbiTeam Software GmbH & Co. KG, Germany
http://www.orbiteam.de
-----Original Message-----
From: Andi Vajda [mailto:[email protected]]
Sent: Wednesday, 18 April 2012 20:37
To: [email protected]
Subject: Re: AW: PyLucene use JCC shared object by default
Hi Thomas,
...
Lucene 3.6 just got released a few days ago. Apart from your patch, the
PyLucene 3.6 release is ready. I'm about to go offline (email only) for a week.
Let's revisit this patch then (first week of May). It's not blocking the release
right now as, even if I sent out a release candidate for a vote, the three
business days required for this would take this into the time I'm away.
...
Andi..