> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ofer Nave
> Sent: Monday, March 26, 2007 5:01 PM
>
> I reimplemented my FooAnalyzer using this pattern and it
> works now. I still don't know why, but at least it works. :)
Ever since I started using a custom Analyzer and TokenFilter, my index build
script keeps crashing. Usually it just freezes at a random point and won't
even respond to Ctrl-C (I have to use kill -9 from another terminal). One
time it ended with: 'Fatal Python error: This thread state must be current
when releasing'. One time it finished successfully (out of about 20
attempts). This is from repeated runs without changing any code.

I'm not creating any threads. It's a straight Python script, no Apache or
web stuff involved. The only change has been the custom analyzer and
token filter.

For reference:
---
class TermJoinTokenFilter(object):

    TOKEN_TYPE_JOINED = "JOINED"

    def __init__(self, tokenStream):
        self.tokenStream = tokenStream
        self.a = None
        self.b = None

    def __iter__(self):
        return self

    def next(self):
        if self.a:
            # emitted prev last time - need to set next, emit prev + next,
            # and reset prev to None
            self.b = self.tokenStream.next()
            if self.b is None:
                return None
            joined = PyLucene.Token(self.a.termText() + self.b.termText(),
                                    self.a.startOffset(), self.a.endOffset(),
                                    self.TOKEN_TYPE_JOINED)
            joined.setPositionIncrement(0)
            self.a = None
            return joined
        elif self.b:
            # emitted prev + next last time - need to emit next,
            # set prev to next, and reset next to None
            self.a = self.b
            self.b = None
            return self.a
        else:
            # first call ever - set prev to first token and emit first token
            self.a = self.tokenStream.next()
            return self.a


class TermJoinAnalyzer(object):

    def __init__(self, analyzer=PyLucene.StandardAnalyzer()):
        self.analyzer = analyzer

    def tokenStream(self, fieldName, reader):
        tokenStream = self.analyzer.tokenStream(fieldName, reader)
        filter = TermJoinTokenFilter(tokenStream)
        return tokenStream.tokenFilter(filter)
---
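
To check whether the join logic itself is at fault, the three-branch state
machine in next() can be traced without PyLucene loaded at all. This is only
a rough sketch: FakeToken, FakeStream and trace_join are hypothetical
stand-ins written for this trace, not PyLucene classes, and they mimic just
the handful of methods the filter calls. It assumes the wrapped stream
returns None at end of input and that a fresh token has a position increment
of 1.

```python
class FakeToken(object):
    """Hypothetical stand-in for PyLucene.Token (only the methods used here)."""

    def __init__(self, text, start, end, type_="word"):
        self._text = text
        self._start = start
        self._end = end
        self._type = type_
        self._pos_inc = 1  # assumed default position increment

    def termText(self):
        return self._text

    def startOffset(self):
        return self._start

    def endOffset(self):
        return self._end

    def setPositionIncrement(self, n):
        self._pos_inc = n

    def getPositionIncrement(self):
        return self._pos_inc


class FakeStream(object):
    """Hypothetical stand-in for the wrapped token stream."""

    def __init__(self, words):
        tokens = []
        offset = 0
        for w in words:
            tokens.append(FakeToken(w, offset, offset + len(w)))
            offset += len(w) + 1  # pretend words are separated by one space
        self._tokens = iter(tokens)

    def next(self):
        # None signals end of stream, as the filter above expects.
        return next(self._tokens, None)


def trace_join(words):
    """Run the same prev/next state machine and collect what it would emit."""
    stream = FakeStream(words)
    a = b = None
    emitted = []
    while True:
        if a:
            # emitted prev last time: fetch next and emit prev + next
            b = stream.next()
            if b is None:
                break
            tok = FakeToken(a.termText() + b.termText(),
                            a.startOffset(), a.endOffset(), "JOINED")
            tok.setPositionIncrement(0)
            a = None
        elif b:
            # emitted prev + next last time: emit next, which becomes prev
            a, b = b, None
            tok = a
        else:
            # first call: emit the first token, if there is one
            a = stream.next()
            tok = a
            if tok is None:
                break
        emitted.append((tok.termText(), tok.getPositionIncrement()))
    return emitted


print(trace_join(["foo", "bar", "baz"]))
# [('foo', 1), ('foobar', 0), ('bar', 1), ('barbaz', 0), ('baz', 1)]
```

In this trace the logic terminates cleanly and emits each joined token at
position increment 0 between the two originals, which would suggest the
freeze comes from somewhere other than the join logic itself.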
-ofer