[Andi]
> Yes, you can write an Analyzer implementation in python, take a look at the 
> test_PositionIncrement.py unit test suite.

I'm trying to write my own Analyzer, and I can't figure it out.
test_PositionIncrement.py doesn't use the Reader object that gets passed to
the Analyzer, and I can't see how to use that Reader.  This is what I have:

import re
from PyLucene import *

class _TokenStream(object):
    def __init__(self, reader):
        text = reader.read()  # This fails - what can I do with reader??
        self.tokens = re.findall(u'[a-z]+', text.lower())
        self.i = 0
    
    def next(self):
        if self.i == len(self.tokens):
            return None
        t = Token(self.tokens[self.i], self.i, self.i)
        self.i += 1
        return t

class MyAnalyzer(object):
    def tokenStream(self, fieldName, reader):
        return _TokenStream(reader)

analyzer = MyAnalyzer()
store = RAMDirectory()
writer = IndexWriter(store, analyzer, True)
d = Document()
d.add(Field.Text("field", "This is a test."))
writer.addDocument(d)
writer.optimize()
writer.close()

I can't see what to do with the Reader object.  Printing dir(reader) doesn't show
any methods, and neither does looking at the definition of Reader in
PyLucene.py.
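
Independently of the Reader problem, the tokenizing step can be checked on a
plain string.  The following is only a minimal sketch, assuming the text is
already in hand as a unicode string; tokenize_with_offsets is a hypothetical
helper name, not part of PyLucene.  It also records character offsets, which
is what the start and end arguments of Lucene's Token are meant to carry,
rather than the token index used in the code above.

```python
import re

def tokenize_with_offsets(text):
    """Return (term, start, end) tuples for each run of a-z letters.

    start and end are character offsets into the original text,
    with end pointing one past the last character of the term.
    """
    # Lower-casing ASCII text does not change its length, so offsets
    # found in the lower-cased copy also apply to the original string.
    return [(m.group(0), m.start(), m.end())
            for m in re.finditer(u'[a-z]+', text.lower())]

# Same sample sentence as in the indexing code above.
tokens = tokenize_with_offsets(u"This is a test.")
for term, start, end in tokens:
    print("%s %d %d" % (term, start, end))
```

Each tuple could then be turned into Token(term, start, end) inside next(),
once the text can actually be obtained from the Reader.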

Thanks for any advice,

-- 
Richie Hindle
[EMAIL PROTECTED]

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
