Re: [KinoSearch] inside-out objects

Marvin Humphrey Tue, 20 Nov 2007 17:48:36 -0800


On Nov 20, 2007, at 12:18 PM, Peter Karman wrote:

Are you finding it makes it easier to do things with XS, C and the
reference counting?

KS objects under anything other than the new, temporary class KinoSearch::Util::Nat maintain their own refcount, separate from Perl. When a Perl object wrapping a KS object has its SvREFCNT fall to 0, the DESTROY method which gets called is KinoSearch::Util::Obj::DESTROY, which simply decrements the KS object's internal refcount rather than invoking Kino_Obj_Destroy(obj).


  void
  DESTROY(self)
      kino_Obj *self;
  PPCODE:
      REFCOUNT_DEC(self);

We have to do things that way because there are many KS objects which Perl doesn't know about. For instance, when TopDocCollector's C constructor TDColl_new() is invoked, it creates its own HitQueue object without telling Perl anything about it. However, should we need to deal with that HitQueue from Perl-space, we have to wrap it in a Perl object. That's what happens here:


  {
      my $hit_queue = $collector->get_hit_queue;
  } # $hit_queue goes out of scope, DESTROY called

Currently, when that $hit_queue goes out of scope, the Perl wrapper object gets destroyed. However, the interior KS HitQueue object must not be destroyed, because $collector still needs it.
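
To make the two-refcount arrangement concrete, here is a minimal C sketch of the idea. It is not the KS source: Obj, Collector, collector_get_hit_queue() and wrapper_destroy() are hypothetical stand-ins, and the Perl wrapper is modeled as nothing more than one extra reference on the inner object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a KS object carrying its own refcount. */
typedef struct Obj {
    unsigned refcount;
    int *destroyed_flag;   /* set to 1 when the object is really freed */
} Obj;

static Obj *obj_new(int *destroyed_flag) {
    Obj *self = malloc(sizeof *self);
    assert(self != NULL);
    self->refcount = 1;
    self->destroyed_flag = destroyed_flag;
    *destroyed_flag = 0;
    return self;
}

static Obj *obj_inc(Obj *self) {
    self->refcount++;
    return self;
}

/* Drop one reference; free only when nobody holds one any more. */
static void obj_dec(Obj *self) {
    assert(self->refcount > 0);
    if (--self->refcount == 0) {
        *self->destroyed_flag = 1;
        free(self);
    }
}

/* Hypothetical collector owning a queue the host language never saw. */
typedef struct Collector {
    Obj *hit_queue;
} Collector;

/* Models $collector->get_hit_queue: wrapping takes one extra reference. */
static Obj *collector_get_hit_queue(Collector *collector) {
    return obj_inc(collector->hit_queue);
}

/* Models the wrapper's DESTROY: decrement only, never destroy directly. */
static void wrapper_destroy(Obj *wrapped) {
    obj_dec(wrapped);
}

/* Returns 1 if the queue outlives the wrapper and dies with its owner. */
int demo_wrapper_scope(void) {
    int destroyed;
    Collector collector = { obj_new(&destroyed) };

    Obj *wrapped = collector_get_hit_queue(&collector);  /* refcount 2 */
    wrapper_destroy(wrapped);                            /* refcount 1 */
    int survived = !destroyed && collector.hit_queue->refcount == 1;

    obj_dec(collector.hit_queue);                        /* owner lets go */
    return survived && destroyed;
}
```

In this model the wrapper going out of scope only drops the count from 2 to 1; the queue is freed when the collector releases its own reference.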

As a consequence, KS objects can reappear wrapped in several different Perl objects, which is rather strange and is probably a bug waiting to bite someone. Here's an example of how things can go wrong: cycling through multiple Perl objects doesn't work well with the inside-out pattern, because DESTROY gets invoked over and over again, necessitating a broken hack like this...


  sub DESTROY {
     my $self = shift;
     if ($self->refcount < 2) {
        delete $inside_out_var{$$self};
     }
     $self->SUPER::DESTROY;
  }

That hack doesn't even work reliably, because if the last refcount gets decremented by KS internally, the Perl DESTROY method will never get called and any inside-out vars will leak.

The solution is to cache a Perl object within a KS object, so that effectively Perl *does* know about it. That's the difference between Nat and Obj. Under Nat, the refcounting is handled via the cached Perl object. There are no longer two refcounts.
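
A rough C model of the cached-wrapper idea, under loud assumptions: HostObj is a hypothetical stand-in for the Perl object (a real implementation would hold an SV and use Perl's own refcounting macros), and NatObj, nat_to_host() and friends are names invented for this sketch, not the Nat API.

```c
#include <assert.h>
#include <stdlib.h>

struct NatObj;

/* Hypothetical stand-in for the Perl-side object; it holds the only refcount. */
typedef struct HostObj {
    unsigned refcount;
    struct NatObj *inner;
} HostObj;

typedef struct NatObj {
    HostObj *host;         /* cached wrapper, created along with the object */
    int *destroyed_flag;   /* set to 1 when the pair is torn down */
} NatObj;

static NatObj *nat_new(int *destroyed_flag) {
    NatObj *self = malloc(sizeof *self);
    HostObj *host = malloc(sizeof *host);
    assert(self != NULL && host != NULL);
    host->refcount = 1;
    host->inner = self;
    self->host = host;
    self->destroyed_flag = destroyed_flag;
    *destroyed_flag = 0;
    return self;
}

/* Handing the object to host-space always yields the same cached wrapper. */
static HostObj *nat_to_host(NatObj *self) {
    self->host->refcount++;
    return self->host;
}

/* The single teardown path, reached no matter which side lets go last. */
static void host_dec(HostObj *host) {
    assert(host->refcount > 0);
    if (--host->refcount == 0) {
        NatObj *inner = host->inner;
        *inner->destroyed_flag = 1;
        free(inner);
        free(host);
    }
}

/* Internal decrements go through the cached wrapper's count as well. */
static void nat_dec(NatObj *self) {
    host_dec(self->host);
}

/* Returns 1 if there is one wrapper, one count, and one teardown. */
int demo_single_refcount(void) {
    int destroyed;
    NatObj *obj = nat_new(&destroyed);

    HostObj *a = nat_to_host(obj);
    HostObj *b = nat_to_host(obj);
    int same_wrapper = (a == b);
    unsigned count = a->refcount;   /* creator + a + b */

    host_dec(a);
    host_dec(b);
    int alive = !destroyed;

    nat_dec(obj);                   /* last reference dropped internally */
    return same_wrapper && count == 3 && alive && destroyed;
}
```

Because every decrement funnels through one count, teardown runs exactly once even when the final reference is dropped internally, which is the case the DESTROY hack above cannot handle.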

One drawback of this design, though, is that Perl objects are heavyweight. That's ok for big stuff like a PostingList, but it's not-so-great for small stuff like a ByteBuf, a Token, or a TermInfo. If we were to put a Perl object into every last one of those, I'd be concerned both about memory usage and performance.

My current plan is to override the refcounting infrastructure for small classes by basing them off of a "FastObj" class which will use an integer refcount as Obj does now. The scheme is more complicated to implement than I'd like, and it will have the one-KS-object-many-Perl-objects problem for anything that subclasses FastObj. But it will work in the near term and maybe it won't be so bad.
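
One way such an override could be arranged in C is to dispatch refcounting through a per-class method table, so that small classes swap in a plain integer count. This is only a sketch of that general technique, not the planned FastObj implementation: VTable, NatLike, FastLike and HostObj are hypothetical names, and HostObj again stands in for a cached Perl object.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Obj Obj;

/* Hypothetical per-class table of refcounting methods. */
typedef struct VTable {
    void     (*inc_refcount)(Obj *self);
    unsigned (*dec_refcount)(Obj *self);   /* returns the count after the decrement */
    unsigned (*get_refcount)(Obj *self);
} VTable;

struct Obj { const VTable *vtable; };

/* Stand-in for a heavyweight host-language object. */
typedef struct HostObj { unsigned refcount; } HostObj;

/* Big classes: the count lives in the cached host object. */
typedef struct NatLike { Obj base; HostObj *host; } NatLike;

/* Small classes: the count is a plain integer inside the object. */
typedef struct FastLike { Obj base; unsigned refcount; } FastLike;

static void nat_inc(Obj *self) { ((NatLike *)self)->host->refcount++; }
static unsigned nat_get(Obj *self) { return ((NatLike *)self)->host->refcount; }
static unsigned nat_dec(Obj *self) {
    NatLike *nat = (NatLike *)self;
    unsigned remaining = --nat->host->refcount;
    if (remaining == 0) { free(nat->host); free(nat); }
    return remaining;
}
static const VTable NAT_VTABLE = { nat_inc, nat_dec, nat_get };

static void fast_inc(Obj *self) { ((FastLike *)self)->refcount++; }
static unsigned fast_get(Obj *self) { return ((FastLike *)self)->refcount; }
static unsigned fast_dec(Obj *self) {
    FastLike *fast = (FastLike *)self;
    unsigned remaining = --fast->refcount;
    if (remaining == 0) free(fast);
    return remaining;
}
static const VTable FAST_VTABLE = { fast_inc, fast_dec, fast_get };

static Obj *nat_like_new(void) {
    NatLike *self = malloc(sizeof *self);
    HostObj *host = malloc(sizeof *host);
    assert(self != NULL && host != NULL);
    host->refcount = 1;
    self->base.vtable = &NAT_VTABLE;
    self->host = host;
    return &self->base;
}

static Obj *fast_like_new(void) {
    FastLike *self = malloc(sizeof *self);
    assert(self != NULL);
    self->base.vtable = &FAST_VTABLE;
    self->refcount = 1;
    return &self->base;
}

/* Generic code never needs to know which scheme a class uses. */
static void obj_inc(Obj *obj) { obj->vtable->inc_refcount(obj); }
static unsigned obj_dec(Obj *obj) { return obj->vtable->dec_refcount(obj); }

/* Returns 1 if both schemes behave identically through the same calls. */
int demo_refcount_dispatch(void) {
    Obj *objs[2] = { nat_like_new(), fast_like_new() };
    int ok = 1;
    for (int i = 0; i < 2; i++) {
        obj_inc(objs[i]);
        ok = ok && objs[i]->vtable->get_refcount(objs[i]) == 2;
        ok = ok && obj_dec(objs[i]) == 1;
        ok = ok && obj_dec(objs[i]) == 0;   /* freed here; do not touch again */
    }
    return ok;
}
```

The trade-off matches the one described above: the integer-count class skips the allocation of a host object, but nothing in it remembers a wrapper, so each trip into host-space would need a fresh one.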


Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
