On Mon, 2 Jun 2008, Cloud Zhang wrote:

Adding a new analyzer (in jar form) in Java is really straightforward, but
when I tried to add one for PyLucene, I found no way to refer to the jar
package.

I went through the build process of PyLucene and guess that maybe I could:
* put the analyzer source under
PyLucene-2.3.2-1/lucene-java-2.3.2/contrib/analyzers/src/java/, and
recompile Lucene then pyLucene
or
* put the analyzer jar somewhere in the building folder and add it to the
Makefile, then recompile pyLucene

Could either of these work? Or is there another solution that is as
straightforward as setting CLASSPATH in Java?

To access your class(es) by name from Python, you must have JCC generate wrappers for it (them). This is what is done from line 177 onward in PyLucene's Makefile. The easiest way for you to add your own Java classes to PyLucene is to create another jar file containing your own analyzer classes and code, and add it to the JCC invocation there.

For example, the Makefile snippet in question currently says:

GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) \
           --package java.lang java.lang.System \
                               java.lang.Runtime \
           --package java.util \
                     java.text.SimpleDateFormat \
           --package java.io java.io.StringReader \
                             java.io.InputStreamReader \
                             java.io.FileInputStream \
           --exclude org.apache.lucene.queryParser.Token \
           --exclude org.apache.lucene.queryParser.TokenMgrError \
           --exclude org.apache.lucene.queryParser.QueryParserTokenManager \
           --exclude org.apache.lucene.queryParser.ParseException \
           --python lucene \
           --mapping org.apache.lucene.document.Document 'get:(Ljava/lang/String;)Ljava/lang/String;' \
           --mapping java.util.Properties 'getProperty:(Ljava/lang/String;)Ljava/lang/String;' \
           --sequence org.apache.lucene.search.Hits 'length:()I' 'doc:(I)Lorg/apache/lucene/document/Document;' \
           --version $(LUCENE_VER) \
           --files $(NUM_FILES)


change the first line to say:

GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) --jar myjar.jar \
   ...

and rebuild PyLucene. That should be all you need to do. Your jar file is going to be installed along with Lucene's in the lucene egg, and it is going to be put on lucene.CLASSPATH, which you pass to lucene.initVM().

Your classes can be declared in any Java package you want. Just make sure that their names don't clash with the names of other Lucene classes that you also need to use, since the class namespace is flattened in PyLucene.
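
Once PyLucene has been rebuilt with the extra --jar, using the class from Python could look roughly like the sketch below. The class name MyAnalyzer and the helper make_custom_analyzer are hypothetical, and the sketch assumes the analyzer has a no-argument constructor; only lucene.initVM() and lucene.CLASSPATH come from the description above.

```python
def make_custom_analyzer(class_name="MyAnalyzer"):
    """Start the JVM and instantiate a JCC-wrapped class by its flattened name.

    class_name is a hypothetical example; substitute the simple (unqualified)
    name of the class in your own jar.
    """
    # Imported here so the helper can be defined even where PyLucene is absent.
    import lucene

    # lucene.CLASSPATH lists the jars installed in the lucene egg, which
    # should include your own jar after the rebuild.
    lucene.initVM(lucene.CLASSPATH)

    # The class namespace is flattened, so the class is looked up directly on
    # the lucene module regardless of its Java package.
    analyzer_class = getattr(lucene, class_name, None)
    if analyzer_class is None:
        raise ImportError(
            "%s is not wrapped; was the jar passed to JCC with --jar?" % class_name
        )

    # Assumption: the analyzer has a no-argument constructor.
    return analyzer_class()
```

Calling make_custom_analyzer() would then return an instance that can be passed wherever PyLucene expects an Analyzer, assuming your class extends org.apache.lucene.analysis.Analyzer.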

For more information about JCC and its command line args see JCC's README file at [1].

Andi..

[1] http://svn.osafoundation.org/pylucene/trunk/jcc/jcc/README
_______________________________________________
pylucene-dev mailing list
pylucene-dev@osafoundation.org
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
