This is to announce the release of PyLucene 0.9.7.

This release focused on porting almost all of the "Lucene in Action" book's samples and tests to PyLucene. Everything was ported except the samples that depend on third-party Java libraries with no Python equivalent and those related to remote searching. I'm pleased to report that all "Lucene in Action" unit tests pass.

Along the way, I fixed numerous bugs, plugged many API holes and added pythonic extensions to some Lucene classes, such as Hits and IndexReader, which are now iterable. For example:

    for i, doc in hits:
        ... more stuff here ...

where i is the doc number and doc is the current doc.
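
For readers unfamiliar with the Python protocol methods involved, here is a rough sketch of what "iterable Hits" means. FakeHits is a hypothetical pure-Python stand-in written for this illustration; it is not PyLucene's implementation, and the list position standing in for the doc number is an assumption of the sketch.

```python
# Hypothetical stand-in for illustration only; this is not PyLucene code.
class FakeHits:
    """Mimics the iteration protocol described above with plain Python."""

    def __init__(self, docs):
        self._docs = list(docs)

    def __len__(self):
        return len(self._docs)

    def __getitem__(self, i):
        return self._docs[i]

    def __iter__(self):
        # Yield (number, doc) pairs, matching "for i, doc in hits".
        # Assumption: the list position stands in for the doc number.
        return iter(enumerate(self._docs))

    def __bool__(self):
        return len(self._docs) > 0

    __nonzero__ = __bool__  # Python 2 spelling of the truth-value hook


hits = FakeHits(["doc-a", "doc-b"])
pairs = [(i, doc) for i, doc in hits]
```

With these methods in place, len(hits), hits[1] and "if hits:" all behave as they would for a built-in sequence.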


I also added support for indexing some non-plain-text formats, such as HTML, XML, PDF and MS Word. For more information, see the samples/LuceneInAction/lia/handlingtypes Python code tree.

Until documentation is available, which is the next focus, the PyLucene.i file in the source distribution can be used as a quick reference to find out what is available and how to use it. In addition, the ported "Lucene in Action" Python code provides lots of examples of Lucene and PyLucene usage.

I highly recommend a purchase of an electronic or printed copy of the "Lucene in Action" book. See http://www.manning.com/hatcher2 for more information.

It is my impression that, with the exception of remote searching, PyLucene is at this time very close to supporting all of Java Lucene's APIs. If you find something is missing or not extensible - yet it should be - please let me know.

The binaries available from http://pylucene.osafoundation.org were built on Mac OS X 10.3.7, Gentoo Linux 2004.3 and Windows 2000.

Below is a list of changes applied since the last release.

Andi..

 - added support for TermPositionVector
 - upgraded Highlighter to latest in sandbox CVS (without TokenSources.java)
 - fixed bug in jsearchableArray type handler
 - added all search overloads to Searcher, SWIG needs them on the same class
 - fixed bug in PythonSearchable, renaming search overloads for python call
 - inverted the store patches from patches.store-4.3 to patches.store-4.2
 - added IndexWriter.optimize(yield) overload
 - added missing IndexWriter.addIndexes(IndexReader *) overload
 - added support for Lock, InputStream, OutputStream
 - added missing Directory methods
 - added python extension support to Directory, InputStream, OutputStream, Lock
 - added missing FSDirectory methods
 - added support for RAMOutputStream
 - Object.toString() wasn't properly inherited
 - improved type error reporting
 - added most missing Object methods
 - added support for Properties
 - added support for Process and most missing Runtime methods
 - added IndexWriter(jstring, janalyzer, jboolean) constructor
 - added HitsEnumeration to iterate over documents of Hits
 - made Hits more pythonic, added __iter__, __len__, __getitem__, __nonzero__
 - integrated br, cn, cjk, cz, fr and nl analyzers from sandbox
 - added support for System.out and System.err
 - added support for SimpleDateFormat
 - fixed bug in returning Locale objects
 - added TokenEnumeration to iterate over tokens of TokenStream
 - made TokenStream iterable
 - added support for LetterTokenizer, LowerCaseTokenizer
 - added IndexWriter.addIndexes(x, yield) overload
 - calling back into python now always ensures GIL
 - added IndexWriter.addDocument(..., yield) overloads
 - added support for pythonic Reader.read() and Reader.read(int)
 - added python charTokenizer and tokenFilter factory methods to TokenStream
 - added support for StandardAnalyzer.STOP_WORDS
 - added IndexReaderEnumeration to iterate over documents of IndexReader
 - made IndexReader iterable
 - made Document more pythonic, added __iter__, __getitem__, and __delitem__
 - added support for TermDocs.read(), returns a tuple of int arrays
 - added support for Calendar and GregorianCalendar
 - added Field.Keyword(String, Date)
 - fixed bug in ParallelMultiSearcher, search methods need to yield GIL
 - fixed bugs in returning SortField[] and ScoreDoc[]
 - fixed uncaught exception in __del__() bug
 - added support for TopDocs and TopFieldDocs constructors
 - consolidated object array return code
 - added support for NumberFormat, DecimalFormat
 - added support for query factory python extension of QueryParser, with super
 - added support for downcasting and instanceof operators on ScoreDoc
 - fixed bug in jcomparableArray type checker
 - fixed bug in passing int[]
 - added support for Spans, SpanQuery.getSpans()
 - most "Lucene in Action" samples and test cases ported to python/PyLucene
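
Several entries above make Java-style enumerations iterable from Python (Hits, TokenStream, IndexReader). As a rough sketch of the general idea, and not of how PyLucene does it internally, a stream whose next() method returns a null value at the end can be adapted to a Python iterator. FakeTokenStream and its whitespace tokenizing are hypothetical, made up for this illustration.

```python
# Hypothetical stand-in for illustration only; this is not PyLucene code.
class FakeTokenStream:
    """Imitates a Java-style stream whose next() returns None when exhausted."""

    def __init__(self, text):
        # Assumption: naive whitespace splitting stands in for a real tokenizer.
        self._tokens = text.split()
        self._pos = 0

    def next(self):
        if self._pos >= len(self._tokens):
            return None  # plays the role of Java's null
        token = self._tokens[self._pos]
        self._pos += 1
        return token

    def __iter__(self):
        # iter(callable, sentinel) keeps calling next() until it returns None.
        return iter(self.next, None)


tokens = [t.lower() for t in FakeTokenStream("Lucene in Action")]
```

The two-argument form of iter() keeps the adapter to a single line and avoids writing a separate enumeration class in Python.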
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
