On 8/8/05, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> Thomas - have you had a look at PyLucene and how they do the gcj/SWIG
> wizardry? What kinds of issues did you encounter with gcj? Perhaps
> Andi Vajda from PyLucene could offer some advice?
>
> I'd rather see the gcj/SWIG approach moving forward so that SWIG
> Lucene doesn't lag behind Java Lucene where all the innovation happens.

Yep, I tried to compile PyLucene on my Mac, but it failed because of the Python version that comes with Mac OS X 10.4 (which is 2.3). To be fair to PyLucene, I only tried for a couple of hours, as I don't really have an interest in Python; I only wanted to see how they use gcj.

But aside from that, I tried the PyLucene way first for a whole week. First there was the issue of getting gcj to run on Mac OS X, which ain't easy at all - I had to install DarwinPorts with a fresh gcc. Running gcj over Lucene is easy and works out of the box. But linking Ruby with SWIG-wrapped, gcj-compiled Lucene is not: all I got was a gcj internal compiler error (with both gcc/gcj 3.4.3 and 4.0.1). This bug is marked as a regression in the gcc bug list. On Windows I had a similar amount of trouble using both MinGW and Cygwin; I wasn't able to compile and link the stuff against Ruby.

So to summarize, while there is definitely a strong argument for using gcj to create other-language bindings from the Java version, there are a few issues that IMO make a strong case for CLucene:

* gcj is difficult to use at best, and on Windows & Mac OS X it is especially involved. For me it was nearly impossible as I'm no gcc/gcj expert.
* it prevents the creation of certain bindings such as COM and C# (perhaps except Mono), or at least makes it extremely difficult, as MinGW is not easily combined with Visual C++ AFAIK. And I don't think that there is any chance of debugging such a combination when a problem arises.
* the amount of work necessary to SWIG-wrap the gcj-compiled Lucene for a given target language is immense - just have a look at the SWIG file of PyLucene and the Makefile that makes the magic happen; I think this must be a nightmare to maintain.

I cannot really tell what amount of work would be necessary for CLucene, but since it is a straight C++ library and built with SWIG in mind, I would be surprised if it is not a lot less.

So from a technical point of view, it is my opinion that a pure C++ version is easier to maintain and evolve right now. I also think that most of the innovation in Lucene is not Java-specific, so while it would be duplicated implementation work, the algorithms are the same (or near enough). Also, a pure C++ version of Lucene gives it more momentum IMO in both the Linux world (mbox_lucene or something similar comes to mind) and the Microsoft world (.NET etc.).

> As for Lucene4C versus CLucene and moving CLucene to Apache - I'll
> let the [EMAIL PROTECTED] list discuss it. I'm happy to have CLucene at
> Apache too, though it seems simpler for us to only house a single
> implementation in C. The gcj version would be ideal in my mind, but
> I'm also not skilled in gcj (and haven't touched C in decades,
> practically) - so it certainly is up to the actual coders where to go
> with it.

I don't know whether it is a "Lucene4C vs. CLucene" question anyway. From what I understand, Lucene4C tries to create a simpler API for Lucene, and while they are building on top of a gcj-compiled version of Java Lucene, that is likely not a requirement (I don't think that they want to expose any of the gcj-generated classes). Besides, CLucene is quite far along, so from a practical point of view it would make sense to use/maintain it.

Being the practical guy that I am, I think that any issues between Lucene4C, PyLucene, and CLucene can be worked out if the developers work together. After all, for all I know it might even be possible to use a mixture of the Lucene4C API (for plain C) and the CLucene API (for C++) in front of a gcj-compiled Java Lucene, and all SWIG wrappers could then be built on top of this API. At least technically this is possible and perhaps even feasible.

regards,
Tom
