On Mon, 26 Mar 2007, Ofer Nave wrote:
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ofer Nave
Sent: Monday, March 26, 2007 5:01 PM
I reimplemented my FooAnalyzer using this pattern and it
works now. I still don't know why, but at least it works. :)
Ever since I started using a custom Analyzer and TokenFilter, my index build
script keeps crashing. Usually it just freezes at a random point, and won't
even respond to ctrl-c (I have to use kill -9 in another terminal). One
time it ended with: 'Fatal Python error: This thread state must be current
when releasing'. One time it finished successfully (out of about 20
attempts). This is from repeated runs without changing any code.
If you submit a piece of code that reproduces the problem, I can take a look
at it (best would be something like a unit test, see PyLucene/test).
Also, what is your OS? Did you build PyLucene yourself? If so, which gcj?
Does 'make test' pass? What is your version of Python?
Andi..
I'm not creating any threads. It's a straight Python script, with no Apache or
web stuff involved. The only change has been the custom Analyzer and
TokenFilter.
For reference:
---
import PyLucene

class TermJoinTokenFilter(object):

    TOKEN_TYPE_JOINED = "JOINED"

    def __init__(self, tokenStream):
        self.tokenStream = tokenStream
        self.a = None
        self.b = None

    def __iter__(self):
        return self

    def next(self):
        if self.a:
            # emitted prev last time - need to set next, emit prev + next,
            # and reset prev to None
            self.b = self.tokenStream.next()
            if self.b is None:
                return None
            joined = PyLucene.Token(self.a.termText() + self.b.termText(),
                                    self.a.startOffset(), self.a.endOffset(),
                                    self.TOKEN_TYPE_JOINED)
            joined.setPositionIncrement(0)
            self.a = None
            return joined
        elif self.b:
            # emitted prev + next last time - need to emit next, set prev
            # to next, and reset next to None
            self.a = self.b
            self.b = None
            return self.a
        else:
            # first call ever - set prev to first token and emit first token
            self.a = self.tokenStream.next()
            return self.a


class TermJoinAnalyzer(object):

    def __init__(self, analyzer=PyLucene.StandardAnalyzer()):
        self.analyzer = analyzer

    def tokenStream(self, fieldName, reader):
        tokenStream = self.analyzer.tokenStream(fieldName, reader)
        filter = TermJoinTokenFilter(tokenStream)
        return tokenStream.tokenFilter(filter)
---
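For what it's worth, the joining logic itself can be exercised outside PyLucene
entirely. The sketch below is plain Python with a stand-in Token class (not the
PyLucene API), written as a generator; it reproduces the emit order the filter
above aims for: each original token, followed by the join of it with its
successor at position increment 0. One small difference: the code above sets
the joined token's end offset to self.a.endOffset(), which may be a slip; the
sketch uses the second token's end offset so the joined token spans both.

```python
class Token(object):
    """Stand-in for PyLucene.Token -- just enough for this demo."""
    def __init__(self, text, start, end, type="word", pos_incr=1):
        self.text = text
        self.start = start
        self.end = end
        self.type = type
        self.pos_incr = pos_incr

def join_terms(tokens):
    """Yield each token, preceded (after the first) by a JOINED token
    combining it with the previous token at position increment 0.
    Emit order mirrors TermJoinTokenFilter: t1, t1t2, t2, t2t3, t3, ..."""
    prev = None
    for tok in tokens:
        if prev is not None:
            # joined token spans from the start of prev to the end of tok
            yield Token(prev.text + tok.text, prev.start, tok.end,
                        type="JOINED", pos_incr=0)
        yield tok
        prev = tok
```

For example, feeding it the tokens "new", "york", "city" yields the texts
['new', 'newyork', 'york', 'yorkcity', 'city'], with the joined tokens at
position increment 0 so they overlay the originals in the index.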
-ofer
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev