On Tue, 17 Jun 2008, Felix Schwarz wrote:
> Just a side note: I built RPMs for CentOS/Fedora which compile on CentOS 5
> (i386, x86_64) and Fedora 9 using OpenJDK.
> These RPMs do work well for me and I plan to submit them to Fedora but
> currently the rpath issue is the main blocker for inclusion.
> Currently I don't have the time to patch JCC to use only dlopen (which is the
> only accepted method besides waiting ~ 1/2 year so that OpenJDK /may/ get
> fixed):
> http://thread.gmane.org/gmane.linux.redhat.fedora.devel/85542/focus=85548
Thank you for posting this thread, it was interesting reading.
The 'linking against the JRE' issue has been a mess. It's different on every
platform (client vs server VM), every OS (only Mac OS X has it right at this
time: linking against the system's JavaVM framework is done the same way on
each and every OS X Mac), and even every Linux distro. No two OSs have the
same path names or layouts for their JRE distro. The variety is quite
impressive; it's almost as if this was done on purpose :)
Because of that, I've basically punted on automating the discovery of
the right paths for each and every OS or Linux distro under the sun.
Python's distutils/setuptools is already doing a _very_ good job at invoking
the C++ compiler and linker with all the right flags. One just has to get
the paths right (and the flags right, when doing a new port, such as
for Solaris most recently).
It's been my experience so far that, since moving off of GCJ, most issues
with building JCC-PyLucene have been with putting the right include and link
paths into JCC's setup.py file. This requires an understanding of C++
compilers, linkers and shared libraries which is typically not needed or
expected by software developers doing only Python or only Java development.
My sincere apologies to these developers.
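To make the "right include and link paths" point concrete, here is a minimal
sketch of the kind of per-platform table one ends up maintaining. It is not
JCC's actual setup.py: the function name jvm_build_options and the directory
layouts are assumptions for illustration only, and real JRE layouts differ
between distros. The returned keys are standard keyword arguments of
setuptools' Extension class; runtime_library_dirs is the one that ends up as
an rpath in the built library.

```python
import posixpath


def jvm_build_options(platform, jdk_home, arch="i386", vm="client"):
    """Return Extension keyword arguments for linking against a JRE.

    The layouts below are illustrative assumptions, not a survey of
    real distros.
    """
    if platform == "darwin":
        # Assumed layout: the JVM is a system framework with a Headers dir.
        return {
            "include_dirs": [posixpath.join(jdk_home, "Headers")],
            "extra_link_args": ["-framework", "JavaVM"],
        }
    if platform.startswith("linux"):
        # Assumed layout: <jdk>/jre/lib/<arch>/<client|server>/libjvm.so
        lib_dir = posixpath.join(jdk_home, "jre", "lib", arch)
        vm_dir = posixpath.join(lib_dir, vm)
        return {
            "include_dirs": [
                posixpath.join(jdk_home, "include"),
                posixpath.join(jdk_home, "include", "linux"),
            ],
            "library_dirs": [lib_dir, vm_dir],
            "libraries": ["jvm"],
            # Recorded in the shared library so libjvm is found at run time.
            "runtime_library_dirs": [lib_dir, vm_dir],
        }
    raise ValueError("no known JRE layout for platform %r" % (platform,))
```

The result would be passed along the lines of
Extension(name, sources, **jvm_build_options(sys.platform, jdk_home)),
with jdk_home set by hand for the machine at hand.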
That being said, I'd like to point out that PyLucene has come a long way
since its beginnings in 2004. It's become vastly easier to build and use.
It's also gotten quite stable.
PyLucene started as a bet that something like this could be done with GCJ
and SWIG on Mac OS X, Linux and Windows. The bet proved workable for a while
until SWIG and GCJ kept making it harder and harder.
SWIG, for instance, does not understand the concept of
Java-masquerading-as-C++ that GCJ offers via CNI (which is so much better than
JNI and sorely missed). It became increasingly hard to keep up with SWIG's
changes. The kludges I had to implement to get PyLucene to work had to
change after every SWIG release. Eventually, I gave up and rewrote all
the wrappers by hand (the same was done for PyICU).
The GCJ part of the bet worked longer. Even though GCJ was never able to
compile Java Lucene from sources without heavy patching of the Lucene Java
sources and therefore required a JDK to build the Lucene sources, I was able
to use GCJ for quite some years. The combination of the announcement of
OpenJDK, the corresponding drop in activity on the GCJ project
(measured by the drop in traffic on the GCJ developer mailing list) and the
worsening support for GCJ 4.x on non-Linux (or, more specifically,
non-Red Hat) platforms eventually convinced me to give up on GCJ/CNI as well
and move to a plain C++/JNI/JRE solution by writing a code generator that
more or less replaces what is offered by GCJ's CNI.
I say 'more' because the generated C++ code and JCC take care of managing
the memory of the objects returned by Java - CNI does not - and I say 'less'
because CNI makes true C++ classes out of Java classes - JCC
only generates C++ wrapper proxies. Last but not least, JCC has no knowledge
of Java Lucene; it can be used with any Java code, just like CNI.
About the "use Jython instead, JCC is terrifying" point made in the thread
you linked to:
- There is nothing terrifying about JCC. It's just a C++ wrapper around
a JVM. Yes, that may sound tricky but JCC is a code generator and the
code it generates is quite legible, much more so than SWIG's, for
example.
I'd also argue that JCC is less terrifying than GCJ, especially on
non-Linux platforms. Ever built GCJ for Windows :) ?
- The Java GC and Python GC are aware of each other insofar as
neither will free an object held by the other runtime, even when in a
circular reference situation as happens when one 'extends' a Java class
with a Python class. Python's ref counting, JCC's proxies and Java's
weak refs collaborate to make this work.
- There is a big trade-off with using Jython, apart from the usual
keeping-up-with-CPython point: when using Jython, you're a prisoner of
the Java island again. Java very much wants you to use Java and Java
alone. While calling out of a Java VM via JNI is possible, it's not
part of the culture, or the way of doing things in the Java world. If
everything you need for your project is available from the Java
and Jython ecosystems then using Jython can be a very good choice. If
that is not the case, the CPython/JCC/C++/JNI/JRE solution gives you
the best of both worlds: you are not a prisoner of either.
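The garbage collection point in the list above can be illustrated with a
pure-Python analogy. This is only a sketch of the general idea of breaking a
reference cycle with a weak reference, not JCC's implementation:
JavaSideObject and PythonExtension are hypothetical stand-ins, Python's
weakref module plays the part the text assigns to Java's weak refs, and the
direction of the weak link is an assumption made for the example.

```python
import gc
import weakref


class JavaSideObject:
    """Hypothetical stand-in for the Java half of an 'extended' class."""

    def __init__(self):
        self._python_ref = None

    def attach(self, python_obj):
        # Weak link back to the Python half, so no strong cycle is formed.
        self._python_ref = weakref.ref(python_obj)

    def python_half(self):
        # Returns the Python object, or None once it has been freed.
        return self._python_ref() if self._python_ref is not None else None


class PythonExtension:
    """Hypothetical stand-in for a Python class 'extending' a Java class."""

    def __init__(self):
        self.java = JavaSideObject()  # strong reference: Python -> "Java"
        self.java.attach(self)        # weak reference: "Java" -> Python


ext = PythonExtension()
java = ext.java
alive_before = java.python_half() is ext

# Dropping the last strong reference lets ref counting free the Python half,
# even though the "Java" half still knows about it.
del ext
gc.collect()
alive_after = java.python_half() is not None
```

While the Python object is referenced, the "Java" half can reach it; once
the last Python reference goes away, the object is freed and the weak link
simply reports that it is gone.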
On a related note, in the next few weeks I intend to improve JCC so that
it can work "in reverse". In other words, generate the code necessary to
start from a Java process (instead of a CPython process) and invoke
CPython code. This would make it possible to use CPython from Hadoop,
for example. Adding support for Java 1.5 generics is also planned.
Further reading:
https://bugzilla.redhat.com/show_bug.cgi?id=449360
http://article.gmane.org/gmane.linux.redhat.fedora.devel/85542
> Just in case anyone is interested, I can publish my RPMs somewhere.
If you send a message to this list detailing what distro(s) your RPMs can be
used with, what version of Python, Lucene, Java, JCC and PyLucene they're
built from and where they're hosted, I'll add a link to that message on
PyLucene's homepage.
Thanks !
Andi..
_______________________________________________
pylucene-dev mailing list
pylucene-dev@osafoundation.org
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev