On Fri, 15 Sep 2006, Rob Young wrote:

Python: 2.4.3
PyLucene: 2.0.0

I am trying to write a custom Analyzer and TokenFilter, but I keep getting a NullPointerException whenever I try to place another filter after mine. If I change the order of the filters so that mine is last, everything is fine. Any ideas on what the problem may be?

Also, not a huge problem, but a little confusing. Why do I always have to override the constructor, even if I am adding nothing of significance?

Look at the examples in tests/test_PositionIncrementTestCase.py.
There, a custom Python analyzer is set up (see the testSetPosition unit test). The PyLucene 'extension' mechanism isn't really an extension mechanism so much as a wrapper in reverse.

Basically, a PythonAnalyzer Java class implements native methods that call a Python object that implements the Analyzer protocol. Hence there is no need to make your Python analyzer a subclass of Analyzer. A subclass of object is enough.
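
To illustrate the "subclass of object is enough" point, here is a minimal sketch of the duck-typed protocol. It deliberately does not import PyLucene: FakeToken, WhitespaceStream, LowerCaseStream, PassThroughFilter, SketchAnalyzer and collect_terms are all hypothetical stand-ins, and the "reader" is just a plain string. It assumes the protocol is a tokenStream(fieldName, reader) method on the analyzer and a next() method on each stream that returns None at end of stream, as Lucene's Java TokenStream does. With real PyLucene classes the details may differ, so treat the unit test mentioned above as the reference.

```python
# Stand-ins for illustration only; none of these classes come from PyLucene.

class FakeToken(object):
    """Hypothetical token exposing only termText()."""
    def __init__(self, text):
        self._text = text

    def termText(self):
        return self._text


class WhitespaceStream(object):
    """Hypothetical tokenizer stand-in: one token per whitespace-separated word."""
    def __init__(self, text):
        self._words = text.split()
        self._pos = 0

    def next(self):
        if self._pos == len(self._words):
            return None  # end of stream is signalled with None, not StopIteration
        token = FakeToken(self._words[self._pos])
        self._pos += 1
        return token


class LowerCaseStream(object):
    """Hypothetical stand-in for a lower-casing filter."""
    def __init__(self, input):
        self.input = input

    def next(self):
        token = self.input.next()
        if token is None:
            return None
        return FakeToken(token.termText().lower())


class PassThroughFilter(object):
    """Custom filter as a plain object subclass; it only implements next()."""
    def __init__(self, input):
        self.input = input

    def next(self):
        # Hand back whatever the wrapped stream produces, including None.
        return self.input.next()


class SketchAnalyzer(object):
    """Analyzer as a plain object subclass; it only implements tokenStream()."""
    def tokenStream(self, fieldName, reader):
        result = WhitespaceStream(reader)
        result = PassThroughFilter(result)  # custom filter is NOT last in the chain
        result = LowerCaseStream(result)
        return result


def collect_terms(stream):
    """Drain a stream by calling next() until it returns None."""
    terms = []
    token = stream.next()
    while token is not None:
        terms.append(token.termText())
        token = stream.next()
    return terms
```

For example, collect_terms(SketchAnalyzer().tokenStream("contents", "A little chunK")) walks the whole chain with the custom filter in the middle. Each stage relies only on the wrapped object having a next() method, which is why no particular base class is needed in this sketch.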

Andi..



from PyLucene import \
  Analyzer, TokenFilter, StringReader, \
  StandardTokenizer, LowerCaseFilter

class TestAnalyzer( Analyzer ):
  def __init__( self ):
      pass
  def tokenStream( self, reader ):
      result = StandardTokenizer( reader )
      # If I change the order of these two filters
      # it works OK
      result = LowerCaseFilter( result )
      result = TestFilter( result )
      return result

class TestFilter( TokenFilter ):
  def __init__( self, input ):
      self.input = input
  def __iter__( self ):
      return self
  def next( self ):
      token = self.input.next()
      if not token:
          raise StopIteration
      return token

text = "A little chunK oF Text foR Me to analyze as a test for this problem I'm having"
tokenstream = TestAnalyzer().tokenStream( StringReader( text ) )
for token in tokenstream:
  print token.termText()

