On Thu, 17 Jan 2008, anurag uniyal wrote:

I have a custom analyzer which uses custom tokenizers in its tokenStream method. (see attached code)

I couldn't find any attached code in your message. Maybe the list software is stripping them...

Now tokenStream may be called several times, over which I have no control.

This looks like the problem Brian is facing too.
I suggested that he keep track of his BrianFilter instances and call finalize() on them after each call to indexWriter.addDocument(), by adding code for this purpose to his custom analyzer.

See 
http://lists.osafoundation.org/pipermail/pylucene-dev/2008-January/002232.html

Otherwise I will wrap the tokenStream method to keep track of custom tokenizers and finalize them once I get StopIteration.

It looks like another candidate for a decorator here.
  @finalizer
  def tokenStream(self):
      return stuff...

  and finalizer() would be defined to add the return value to some list that
  would then be iterated with calls to finalize().

This is actually looking like the background thread idea I had suggested earlier in that I would add code to store all such extension instances in a list on the env object returned by initVM(). Then, the background thread would walk this list and finalize() anything that is only referenced by the list (in addition to the deadly embrace ref, of course).

It also looks like the FinalizerWrapper class suggested earlier will finalize() things that are in use by the Java VM but that are no longer referenced in Python. This would cause problems. finalize() should only be called once one is absolutely __sure__ that no one, neither the Python VM nor the Java VM, is using the objects in question.

The background thread idea I had suggested would, instead of finalize()'ing the objects once no other python refs are found, replace the global java ref part of the deadly embrace with a global java weak ref instead. This would allow Java to retain the object until it's done with it itself. In other words, the actual finalization of the object would happen when Java eventually collects the object.

Andi..

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
