-----Original Message-----
From: Andi Vajda [mailto:va...@apache.org]
Sent: Thursday, 3 May 2012 01:00
To: pylucene-dev@lucene.apache.org
Subject: Re: AW: AW: AW: PyLucene use JCC shared object by default
Hi Thomas,
On Wed, 2 May 2012, Thomas Koch wrote:
could you download the patch from the link?
Yes, I got your patch just fine.
I fixed a few bugs today having to do with converting sequences to JArray
and added support for auto-boxing primitive types when converting a
sequence to an object JArray. Now all of your collections-demo.py works fine!
With these fixes the Python toArray() methods can return a Python
sequence object directly, there is no need to do the JArray conversion in
Python anymore.
I simplified the collections.py file a bit to reflect the fixes, and all changes,
including the PythonList/PythonListIterator code, are now checked in.
Could you please convert collections-demo.py into a proper unit test module
like the unit tests in pylucene/test so that it gets integrated into the test
suite?
Thanks !
Andi..
Just one more thing ... in the initial implementation of PythonList I did the
toArray() method in Python and the toArray(Object[]) method in Java - just
as was done for the PythonSet:
+    public native List subList(int fromIndex, int toIndex);
+    public native Object[] toArray();
+
+    public Object[] toArray(Object[] a)
+    {
+        Object[] array = toArray();
+
+        if (a.length < array.length)
+            a = (Object[]) Array.newInstance(a.getClass().getComponentType(),
+                                             array.length);
+
+        System.arraycopy(array, 0, a, 0, array.length);
+
+        return a;
+    }
(from patch of Feb 22nd I sent to you)
From the current patch you can see that the latter part is missing now -
the toArray(Object[]) method is now done in Python as well, i.e. it simply
calls toArray():
===================================================================
--- java/org/apache/pylucene/util/PythonSet.java (revision 1332162)
+++ java/org/apache/pylucene/util/PythonSet.java (working copy)
@@ -62,14 +62,6 @@
     public Object[] toArray(Object[] a)
     {
-        Object[] array = toArray();
-
-        if (a.length < array.length)
-            a = (Object[]) Array.newInstance(a.getClass().getComponentType(),
-                                             array.length);
-
-        System.arraycopy(array, 0, a, 0, array.length);
-
-        return a;
+        return toArray();
     }
 }
As far as I remember that was part of your changes in between (I probably
never touched PythonSet). Anyway I could imagine that this is related to the
current problem.
However, the ArrayList never calls the 2nd toArray method. The ArrayList
constructor actually triggers the "simple" toArray method:
    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @param c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        size = elementData.length;
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    }
The ArrayList source code is attached (from OpenJDK6 sources).
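The observation that the ArrayList constructor only goes through the no-argument toArray() can be checked with a small stand-alone probe. This is a minimal sketch, not PyLucene or JCC code: ToArrayProbe and CountingCollection are hypothetical names introduced here, and the probe uses a plain Java collection instead of a Python-backed one.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical probe (not part of PyLucene/JCC): records which toArray
// variant is invoked while an ArrayList is built from a collection.
final class ToArrayProbe {

    static final class CountingCollection extends AbstractCollection<String> {
        private final List<String> items = Arrays.asList("a", "b");
        int plainCalls = 0;  // number of calls to toArray()
        int typedCalls = 0;  // number of calls to toArray(T[])

        @Override public Iterator<String> iterator() { return items.iterator(); }
        @Override public int size() { return items.size(); }

        @Override public Object[] toArray() {
            plainCalls++;
            return super.toArray();
        }

        @Override public <T> T[] toArray(T[] a) {
            typedCalls++;
            return super.toArray(a);
        }
    }

    // Returns {plainCalls, typedCalls} observed during new ArrayList(c).
    static int[] probe() {
        CountingCollection c = new CountingCollection();
        List<String> copy = new ArrayList<String>(c);
        if (copy.size() != 2) {
            throw new IllegalStateException("unexpected size: " + copy.size());
        }
        return new int[] { c.plainCalls, c.typedCalls };
    }

    public static void main(String[] args) {
        int[] counts = probe();
        System.out.println("toArray() calls: " + counts[0]
                           + ", toArray(T[]) calls: " + counts[1]);
    }
}
```

If the typed counter stays at zero, the change to toArray(Object[]) cannot by itself explain the ArrayList problem, which matches the reading of the constructor source above.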
So maybe that's the wrong path ... Anyhow I feel the mapping of
toArray(Object[]) to toArray() does not fully comply with the Java API
description:
http://docs.oracle.com/javase/6/docs/api/java/util/List.html#toArray(T[])
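For reference, the two cases the List.toArray(T[]) documentation describes can be observed on a plain java.util.ArrayList. The sketch below only illustrates the documented contract; ToArrayContract is a hypothetical class name and nothing in it is taken from PythonSet or PythonList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: shows the documented behavior of
// List.toArray(T[]) using a standard ArrayList.
final class ToArrayContract {

    static List<String> sample() {
        return new ArrayList<String>(Arrays.asList("a", "b"));
    }

    // Case 1: the argument is too small, so a new array with the argument's
    // runtime component type (String[], not Object[]) is allocated.
    static boolean tooSmallYieldsTypedArray() {
        Object[] result = sample().toArray(new String[0]);
        return result.getClass() == String[].class && result.length == 2;
    }

    // Case 2: the argument is large enough, so that same array is filled and
    // returned, and the slot right after the last element is set to null.
    static boolean largeEnoughIsReused() {
        String[] target = { "x", "x", "x", "x" };
        String[] result = sample().toArray(target);
        return result == target
            && "a".equals(target[0]) && "b".equals(target[1])
            && target[2] == null
            && "x".equals(target[3]);
    }

    public static void main(String[] args) {
        System.out.println("typed array allocated: " + tooSmallYieldsTypedArray());
        System.out.println("passed array reused:   " + largeEnoughIsReused());
    }
}
```

An implementation of toArray(Object[]) that simply returns toArray() provides neither behavior: the result is never the caller's array and does not take on its runtime type.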
Hope that helps...
Regards,
Thomas
-----Original Message-----
From: Andi Vajda [mailto:va...@apache.org]
Sent: Monday, 30 April 2012 19:32
To: pylucene-dev@lucene.apache.org
Subject: Re: AW: AW: PyLucene use JCC shared object by default
On Mon, 30 Apr 2012, Thomas Koch wrote:
Dear Andi, I again had a look at the patch I submitted recently and
would like to get back to it. An updated version of the patch is
attached to this email - the patch is against the branch_3x repo
http://svn.apache.org/repos/asf/lucene/pylucene/branches/branch_3x
Oh, and there is no attachment in your email. Maybe it got eaten up
by some mail server. Please, make sure it's of a text mimetype or
mail it to me directly.
Thanks !
Andi..
The patch mainly
- adds two Java classes: PythonList, PythonListIterator
- adds the corresponding Python classes (JavaListIterator and JavaList in
collections.py)
Purpose:
- provide a Java-based List implementation in JCC/PyLucene (similar
to the existing PythonSet/JavaSet)
- allow passing Python lists into PyLucene via Java Collections
To summarize briefly: PythonSet/JavaSet already existed, but there was
nothing similar for lists. I made an implementation of PythonList/JavaList
and with your help this is now basically working, except for an open issue
that affects both JavaSet and JavaList: initialization of an ArrayList with
a JavaSet (or JavaList) may cause trouble.
As you said: "There is a bug somewhere with constructing an ArrayList from
a python collection like JavaSet or JavaList."
I tried to change the toArray() method as you suggested, but that didn't
help. As far as I understand, there are two options to box Python values
into a typed JArray:
1) use the object-based JArray class and box Python values by wrapping
them with the corresponding Java object (e.g. type<int> -> lucene.Integer):
    x = lucene.JArray('object')([lucene.Boolean(True),lucene.Boolean(False)])
    JArray<object>[<Object: true>, <Object: false>]
    type(x[0])
    <type 'Object'>
2) use the correct array type (int, float, etc.) and pass the list of Python
elements (or literals) to the JArray constructor, e.g.
    y = lucene.JArray('bool')([True,False])
    JArray<bool>[True, False]
    type(y[0])
    <type 'bool'>
I tried both of them (see the _pyList2JArray methods in collections.py), but
neither of them did the trick. The 'empty objects in ArrayList' problem
remains when handling strings: an ArrayList that is initialized with a
JavaSet or JavaList of string items has the same number of objects as the
original JavaSet/JavaList, but all objects are the same (it looks like an
array of empty objects). Furthermore, another issue comes into play with
integer lists: here the initialization of the ArrayList with the Collection
fails with a Java stacktrace (lucene.JavaError: org.apache.jcc.PythonException).
The simplest test case is as follows:
--%< --
import lucene
lucene.initVM()
from lucene.collections import JavaList

# using strings: the ArrayList is created, but initialized with empty objects
jl = JavaList(['a','b'])
al = lucene.ArrayList(jl)
assert (not al.get(0).equals(al.get(1))), "unique values"

# using ints: the ArrayList is not created, but an error occurs instead:
# Java stacktrace: org.apache.jcc.PythonException: ('while calling toArray')
jl = JavaList(range(3))
al = lucene.ArrayList(jl)
--%< --
I currently feel like I am stabbing around in the dark to find out
what's going on here and would welcome any suggestions. This needs a JCC
expert, I guess ,-)
Of course we can leave the patch out, but there is still the same issue
with JavaSet.
kind regards
Thomas
--
OrbiTeam Software GmbH & Co. KG, Germany http://www.orbiteam.de
-----Original Message-----
From: Andi Vajda [mailto:va...@apache.org]
Sent: Wednesday, 18 April 2012 20:37
To: pylucene-dev@lucene.apache.org
Subject: Re: AW: PyLucene use JCC shared object by default
Hi Thomas,
...
Lucene 3.6 just got released a few days ago. Apart from your patch,
the PyLucene 3.6 release is ready. I'm about to go offline (email
only) for a week.
Let's revisit this patch then (first week of May). It's not blocking the
release right now as, even if I sent out a release candidate for a vote,
the three business days required for this would take this into the time
I'm away.
...
Andi..