Funny you should ask: I'm in the process of porting the 'Lucene in Action' samples to Python and PyLucene. I'm pretty much done with the analysis package of the samples and had to add missing pieces to PyLucene along the way, such as the ability to read from a Reader.


Take a look at PyLucene/samples/LuceneInAction/lia/analysis; there are numerous examples of how to implement an analyzer, filter, token stream, etc. in Python.
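
For a rough idea of the shape such a token stream takes, here is a minimal sketch in plain Python that does not import PyLucene at all. It assumes the reader exposes a read() method returning the whole text, which may not match what PyLucene's Reader actually provides; io.StringIO stands in for it here. RegexTokenStream is a hypothetical name, and it returns plain (term, start, end) tuples where a real stream would return Token objects. The start and end values are character offsets into the text.

```python
import io
import re


class RegexTokenStream(object):
    """Hypothetical stand-in for a Lucene-style token stream."""

    def __init__(self, reader):
        # Assumption: the reader has a read() method returning all the text.
        text = reader.read()
        # Lazily find runs of lowercase letters, keeping their positions.
        self._matches = re.finditer(r'[a-z]+', text.lower())

    def next(self):
        # Return (term, start_offset, end_offset), or None when exhausted,
        # mirroring the "None means no more tokens" convention.
        match = next(self._matches, None)
        if match is None:
            return None
        return (match.group(), match.start(), match.end())


# io.StringIO is only a stand-in for the Reader handed to tokenStream().
stream = RegexTokenStream(io.StringIO(u"This is a test."))
tokens = []
while True:
    token = stream.next()
    if token is None:
        break
    tokens.append(token)
```

In a real analyzer, each tuple would presumably become Token(term, start, end), and tokenStream() would return the stream object.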

I just checked this batch of changes in.

Andi..

On Thu, 3 Feb 2005, Richie Hindle wrote:


[Andi]
Yes, you can write an Analyzer implementation in Python; take a look at the
test_PositionIncrement.py unit test suite.

I'm trying to write my own Analyzer, and I can't figure it out. test_PositionIncrement.py doesn't use the Reader object that gets passed to the Analyzer, and I can't see how to use that Reader. This is what I have:

import re
from PyLucene import *

class _TokenStream(object):
   def __init__(self, reader):
       text = reader.read()  # This fails - what can I do with reader??
       self.tokens = re.findall(u'[a-z]+', text.lower())
       self.i = 0

   def next(self):
       if self.i == len(self.tokens):
           return None
       t = Token(self.tokens[self.i], self.i, self.i)
       self.i += 1
       return t

class MyAnalyzer(object):
   def tokenStream(self, fieldName, reader):
       return _TokenStream(reader)

analyzer = MyAnalyzer()
store = RAMDirectory()
writer = IndexWriter(store, analyzer, True)
d = Document()
d.add(Field.Text("field", "This is a test."))
writer.addDocument(d)
writer.optimize()
writer.close()

I can't see what to do with the Reader object.  Printing dir(reader) doesn't
show any methods, and neither does looking at the definition of Reader in
PyLucene.py.

Thanks for any advice,

--
Richie Hindle
[EMAIL PROTECTED]

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
