On Thu, Feb 09, 2012 at 09:09:08PM +0100, Nick Wellnhofer wrote:
> On 09/02/2012 18:49, Marvin Humphrey wrote:
>> Rehashing our short exchange on IRC for the benefit of the list... I suggest
>> using PolyReader#open for this, since it returns NULL rather than throwing an
>> exception on failure. If open() is successful, the resulting reader can be
>> used as an argument to IndexSearcher#new:
>
> Actually, PolyReader#open doesn't return NULL if the index is empty.
>
> But the list of seg_readers will be empty, so we can test for that.

Ah, sorry for misremembering. That behavior seems sub-optimal in retrospect.
Nevertheless, your revised approach will work.

> See the attached patch for my second attempt.

There is still potential for a schema conflict here: add_doc() passes
$self->{type} to spec_field(), and the type built in new() may not match the
one already recorded in an existing index. We have to get $self->{type} out
of the existing schema when the index exists.

    if ( !@{ $reader->seg_readers } ) {
        # index is empty, create new schema
        $self->{schema} = Lucy::Plan::Schema->new;
    }
    else {
        # get schema from reader
        my $schema = $self->{schema} = $reader->get_schema;
        my ($field) = @{ $schema->get_fields };
        $self->{type} = $schema->fetch_type($field);
    }

    # Create a new FieldType if we haven't discovered one yet.
    if ( !$self->{type} ) {
        my $analyzer = Lucy::Analysis::EasyAnalyzer->new(
            language => $language );
        $self->{type} = Lucy::Plan::FullTextType->new(
            analyzer => $analyzer, );
    }

Aside from that, the patch looked good to me.
> Nick
> diff --git a/perl/lib/Lucy/Simple.pm b/perl/lib/Lucy/Simple.pm
> index aeb92c4..f09301b 100644
> --- a/perl/lib/Lucy/Simple.pm
> +++ b/perl/lib/Lucy/Simple.pm
> @@ -50,7 +50,6 @@ sub new {
> # Get type and schema.
> my $analyzer = Lucy::Analysis::EasyAnalyzer->new( language => $language );
> $self->{type} = Lucy::Plan::FullTextType->new( analyzer => $analyzer, );
> - my $schema = $self->{schema} = Lucy::Plan::Schema->new;
>
> # Cache the object for later clean-up.
> weaken( $obj_cache{ refaddr $self } = $self );
> @@ -61,6 +60,15 @@ sub new {
> sub _lazily_create_indexer {
> my $self = shift;
> if ( !defined $self->{indexer} ) {
> + my $reader = Lucy::Index::PolyReader->open( index => $self->{path} );
> + if ( ! @{ $reader->seg_readers } ) {
> + # index is empty, create new schema
> + $self->{schema} = Lucy::Plan::Schema->new;
> + }
> + else {
> + # get schema from reader
> + $self->{schema} = $reader->get_schema;
> + }
> $self->{indexer} = Lucy::Index::Indexer->new(
> schema => $self->{schema},
> index => $self->{path},
> @@ -70,11 +78,11 @@ sub _lazily_create_indexer {
>
> sub add_doc {
> my ( $self, $hashref ) = @_;
> - my $schema = $self->{schema};
> my $type = $self->{type};
> croak("add_doc requires exactly one argument: a hashref")
> unless ( @_ == 2 and reftype($hashref) eq 'HASH' );
> $self->_lazily_create_indexer;
> + my $schema = $self->{schema};
> $schema->spec_field( name => $_, type => $type ) for keys %$hashref;
> $self->{indexer}->add_doc($hashref);
> }