On Fri, 15 Sep 2006, Rob Young wrote:
Python: 2.4.3
PyLucene: 2.0.0
I am trying to write a custom Analyzer and TokenFilter, but I keep getting a
NullPointerException whenever I place another filter after mine. If I
change the order of the filters so that mine is last, everything is fine. Any
ideas on what the problem may be?
Also, this is not a huge problem, but it is a little confusing: why do I
always have to override the constructor, even if I am adding nothing of
significance?
Look at the examples in tests/test_PositionIncrementTestCase.py.
There, a custom Python analyzer is set up (see the testSetPosition unit
test). The Lucene Python 'extension' mechanism isn't really an 'extension'
mechanism so much as a wrapper in reverse.
Basically, a PythonAnalyzer Java class implements native methods that call a
Python object that implements the Analyzer protocol. Hence there is no need to
make your Python analyzer a subclass of Analyzer; a subclass of object is
enough.
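To make the "subclass of object is enough" point concrete, here is a minimal sketch that runs without PyLucene. Every class in it (SimpleToken, WordStream, PassThroughStream, LowerCaseStream, SketchAnalyzer) and the drain helper are hypothetical stand-ins, not PyLucene classes. It assumes the protocol is a tokenStream(fieldName, reader) method returning an object whose next() returns a token, or None at the end of the stream, mirroring Java Lucene's TokenStream.next() returning null. The sketch passes a plain string where a real analyzer would receive a reader.

```python
# Stand-ins only: nothing here is imported from PyLucene.

class SimpleToken(object):
    """Hypothetical token that carries only its text."""
    def __init__(self, text):
        self.text = text

    def termText(self):
        return self.text


class WordStream(object):
    """Hypothetical source stream: one token per whitespace-separated word."""
    def __init__(self, text):
        self.words = text.split()
        self.pos = 0

    def next(self):
        if self.pos == len(self.words):
            return None  # assumed end-of-stream signal: None, not StopIteration
        token = SimpleToken(self.words[self.pos])
        self.pos += 1
        return token


class PassThroughStream(object):
    """Custom filter written as a plain object wrapping another stream."""
    def __init__(self, source):
        self.source = source

    def next(self):
        # Hand back whatever the wrapped stream returns, including None.
        return self.source.next()


class LowerCaseStream(object):
    """Second filter, placed after the custom one."""
    def __init__(self, source):
        self.source = source

    def next(self):
        token = self.source.next()
        if token is None:
            return None
        return SimpleToken(token.termText().lower())


class SketchAnalyzer(object):
    """Plain object exposing the assumed tokenStream(fieldName, reader) method."""
    def tokenStream(self, fieldName, text):
        stream = WordStream(text)
        stream = PassThroughStream(stream)  # custom filter first
        stream = LowerCaseStream(stream)    # another filter after it
        return stream


def drain(stream):
    """Collect term texts by calling next() until it returns None."""
    terms = []
    token = stream.next()
    while token is not None:
        terms.append(token.termText())
        token = stream.next()
    return terms
```

In this sketch each filter ends the stream by returning None rather than raising StopIteration, so a filter stacked on top of it can check for None the same way a Java caller would check for null. Whether that difference explains the NullPointerException in the original code is an assumption, not something the sketch proves.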
Andi..
from PyLucene import \
    Analyzer, TokenFilter, StringReader, \
    StandardTokenizer, LowerCaseFilter

class TestAnalyzer( Analyzer ):
    def __init__( self ):
        pass

    def tokenStream( self, reader ):
        result = StandardTokenizer( reader )
        # If I change the order of these two filters
        # it works OK
        result = LowerCaseFilter( result )
        result = TestFilter( result )
        return result

class TestFilter( TokenFilter ):
    def __init__( self, input ):
        self.input = input

    def __iter__( self ):
        return self

    def next( self ):
        token = self.input.next()
        if not token:
            raise StopIteration
        return token

text = "A little chunK oF Text foR Me to analyze as a test for this problem I'm having"
tokenstream = TestAnalyzer().tokenStream( StringReader( text ) )
for token in tokenstream:
    print token.termText()
_______________________________________________
pylucene-dev mailing list
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev