[Andi]
> Yes, you can write an Analyzer implementation in python, take a look at the
> test_PositionIncrement.py unit test suite.
I'm trying to write my own Analyzer, and I can't figure it out.
test_PositionIncrement.py doesn't use the Reader object that gets passed to
the Analyzer, and I can't see how to use that Reader. This is what I have:

    import re
    from PyLucene import *

    class _TokenStream(object):
        def __init__(self, reader):
            text = reader.read()   # This fails - what can I do with reader??
            self.tokens = re.findall(u'[a-z]+', text.lower())
            self.i = 0

        def next(self):
            if self.i == len(self.tokens):
                return None
            t = Token(self.tokens[self.i], self.i, self.i)
            self.i += 1
            return t

    class MyAnalyzer(object):
        def tokenStream(self, fieldName, reader):
            return _TokenStream(reader)

    analyzer = MyAnalyzer()
    store = RAMDirectory()
    writer = IndexWriter(store, analyzer, True)
    d = Document()
    d.add(Field.Text("field", "This is a test."))
    writer.addDocument(d)
    writer.optimize()
    writer.close()

I can't see what to do with the Reader object. Printing dir(reader) doesn't show
any methods, and neither does looking at the definition of Reader in
PyLucene.py.
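
In case it helps narrow things down, here is a minimal sketch of the tokenizing step on its own, with no PyLucene involved, so the only open question is how to get the text out of the Reader. The tokenize() helper is hypothetical (it is not part of PyLucene). It reports character offsets from re.finditer, on the assumption that Token's second and third arguments are meant to be start and end character offsets rather than token indices:

```python
import re

def tokenize(text):
    """Return a list of (term, start, end) tuples, one per run of ASCII letters.

    start and end are character offsets into the original text; the term
    itself is lowercased. This is a hypothetical helper, not a PyLucene API.
    """
    return [(m.group(0).lower(), m.start(), m.end())
            for m in re.finditer(r'[A-Za-z]+', text)]

tokens = tokenize("This is a test.")
print(tokens)
```

If that assumption about offsets is right, each tuple could be passed straight to Token(term, start, end) once the text can be read from the Reader.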
Thanks for any advice,
--
Richie Hindle
[EMAIL PROTECTED]