Re: parse does not return false when it should

Peter Marschall Fri, 23 Apr 2004 07:11:18 -0700

Hi Chris, hi Graham,

On Thursday 22 April 2004 21:01, Chris Ridd wrote:
> On 22/4/04 4:34 pm, Peter Marschall <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > On Tuesday 20 April 2004 11:39, Claude wrote:
> >> I tried this code which, IMO, confirms that "parse" wrongly does not
> >> return "undef" in case of error.
> >
> > I think this is the correct diagnosis.
> >
> > Can you try the attached version of Schema.pm and tell if parse() works
> > as documented ?
> >
> > With this version parse() should return undef on errors as it is stated
> > in the pod file.
>
> I wasn't sure if the pod was in error, or the code!
>
> > @Chris, Graham,
> > if you do not object I'l commit his change after Claude's tests.
>
> I haven't tried the code, but on inspection I'd be surprised if simply
> doing this will return undef:
>
>     $schema->_error("Foo");
>     return;


Why not ?

The first line is simply a rewrite of the original statement that makes use of 
the _error() method in Net::LDAP::Schema.
It has one side effect however: now parse() fails when used as a  function on 
an unblessed hash reference [i.e. parse(\%hash, "schema.ldif"); ]
According to 'man perlfunc' the second line "returns an empty list in list 
context, the undefined value in scalar context, and (of course) nothing at 
all in a void context".
So even if you'd call parse() in list context you'd be able to tell the 
difference between success and failure.
(I admit I do not know whether "return undef;" would make any difference.)

Even the one-liner
        return $schema->_error("Foo");
would have done the trick since _error()'s last statement is also a "return;". 

> Also, what happens when ->new("schema.ldif")  fails to parse the schema in
> the LDIF file? I *think* if parse returns undef then there's no way to
> figure out where the problem in the LDIF file was. (A problem when there
> are maybe a couple of thousand pieces of schema in the file :-)

IMHO that is a design question how the ->new(ARG)  interface with argument 
should behave:Should it return undef when parse(ARG) fails or should it 
succeed even if parse fails.
The former is the current behaviour with my parse() patch while the latter can 
be achieved with a minor change even if parse() is allowed to fail.

I can see the advantage of not failing i.e. knowing where the error was.
On the other hand I'd rather let if fail because I do not like objects to 
exist when the initialization failed.
In addition to that Net::LDAP::Schema offers the possibilitiy to split the
->new(ARG) into ->new() and ->parse(ARG) [as Claude did]
So the former case mightbe considered a short cut with coarse error handling 
while the latter offers full error handling.

Maybe we should let a higher authority (guess who) decide ;-)))

If you do not like this idea at all I can change the offending lines in new()
        return $schema unless @_;
        return $schema->parse( shift ) ? $schema : undef;
to
        $schema->parse(shift)  if (@_);
        return $schema;
to get parse() returning undef on error and new() not failing even if parse()
returns undef.

And since the old behaviour of ->new("0") failing even with existing legal 
LDIF file "0" was not ideal too I changed it as well

Please have a look at my second try as it is attached.

Peter

-- 
Peter Marschall
eMail: [EMAIL PROTECTED]

# Copyright (c) 1998-2004 Graham Barr <[EMAIL PROTECTED]>. All rights reserved.
# This program is free software; you can redistribute it and/or
# modify it under the same terms as Perl itself.

package Net::LDAP::Schema;

use strict;
use vars qw($VERSION);

$VERSION = "0.9902_02";

#
# Get schema from the server (or read from LDIF) and parse it into
# data structure
#
sub new {
  my $self = shift;
  my $type = ref($self) || $self;
  my $schema = bless {}, $type;

  $schema->parse(shift) if @_;
  return $schema;
}

sub _error {
  my $self = shift;
  $self->{error} = shift;
  return;
}


sub parse {
  my $schema = shift;
  my $arg = shift;

  unless (defined($arg)) {
    $schema->_error('Bad argument');
    return;
  }

  %$schema = ();

  my $entry;
  if( ref $arg ) {
    if (UNIVERSAL::isa($arg, 'Net::LDAP::Entry')) {
      $entry = $arg;
    }
    elsif (UNIVERSAL::isa($arg, 'Net::LDAP::Search')) {
      unless ($entry = $arg->entry) {
	$schema->_error('Bad Argument');
	return;
      }
    }
    else {
      $schema->_error('Bad Argument');
      return;
    }
  }
  elsif( -f $arg ) {
    require Net::LDAP::LDIF;
    my $ldif = Net::LDAP::LDIF->new( $arg, "r" );
    $entry = $ldif->read();
    unless( $entry ) {
      $schema->_error("Cannot parse LDIF from file [$arg]");
      return;
    }
  }
  else {
    $schema->_error("Can't load schema from [$arg]: $!");
    return;
  }

  eval {
    local $SIG{__DIE__} = sub {};
    _parse_schema( $schema, $entry );
  };

  if ($@) {
    $schema->_error($@);
    return;
  }

  return $schema;
}

#
# Dump as LDIF
#
# XXX - We should really dump from the internal structure. That way we can
#       have methods to modify the schema and write a new one -- GMB
sub dump {
  my $self = shift;
  my $fh = @_ ? shift : \*STDOUT;
  my $entry = $self->{'entry'} or return;
  require Net::LDAP::LDIF;
  Net::LDAP::LDIF->new($fh,"w", wrap => 0)->write($entry);
  1;
}

#
# Given another Net::LDAP::Schema, merge the contents together.
# XXX - todo
#
sub merge {
  my $self = shift;
  my $new = shift;

  # Go through structure of 'new', copying code to $self. Take some
  # parameters describing what to do in the event of a clash.
}


sub all_attributes		{ values %{shift->{at}}  }
sub all_objectclasses		{ values %{shift->{oc}}  }
sub all_syntaxes		{ values %{shift->{syn}} }
sub all_matchingrules		{ values %{shift->{mr}}  }
sub all_matchingruleuses	{ values %{shift->{mru}} }
sub all_ditstructurerules	{ values %{shift->{dts}} }
sub all_ditcontentrules		{ values %{shift->{dtc}} }
sub all_nameforms		{ values %{shift->{nfm}} }

sub superclass {
  my $self = shift;
  my $oc = shift;

  my $elem = $self->objectclass( $oc )
    or return scalar _error($self, "Not an objectClass");

  return @{$elem->{sup} || []};
}

sub must { _must_or_may(@_,'must') }
sub may  { _must_or_may(@_,'may')  }

#
# Return must or may attributes for this OC.
#
sub _must_or_may {
  my $self = shift;
  my $must_or_may = pop;
  my @oc = @_ or return;

  #
  # If called with an entry, get the OC names and continue
  #
  if ( UNIVERSAL::isa( $oc[0], "Net::LDAP::Entry" ) ) {
    my $entry = $oc[0];
    @oc = $entry->get_value( "objectclass" )
      or return;
  }

  my %res;
  my %done;

  while (@oc) {
    my $oc = shift @oc;

    $done{lc $oc}++ and next;

    my $elem = $self->objectclass( $oc ) or next;
    if (my $res  = $elem->{$must_or_may}) {
    @res{ @$res } = (); 	# Add in, getting uniqueness
    }
    my $sup = $elem->{sup} or next;
    push @oc, @$sup;
  }

  my %unique = map { ($_,$_) } $self->attribute(keys %res);
  values %unique;
}

#
# Given name or oid, return element or undef if not of appropriate type
#

sub _get {
  my $self = shift;
  my $type = pop(@_);
  my $hash = $self->{$type};
  my $oid  = $self->{oid};

  my @elem = grep $_, map {
    my $elem = $hash->{lc $_};

    ($elem or ($elem = $oid->{$_} and $elem->{type} eq $type))
      ? $elem
      : undef;
  } @_;

  wantarray ? @elem : $elem[0];
}

sub attribute		{ _get(@_,'at')  }
sub objectclass		{ _get(@_,'oc')  }
sub syntax		{ _get(@_,'syn') }
sub matchingrule	{ _get(@_,'mr')  }
sub matchingruleuse	{ _get(@_,'mru') }
sub ditstructurerule	{ _get(@_,'dts') }
sub ditcontentrule	{ _get(@_,'dtc') }
sub nameform		{ _get(@_,'nfm') }


#
# XXX - TODO - move long comments to POD and write up interface
#
# Data structure is:
#
# $schema (hash ref)
#
# The {oid} piece here is a little redundant since we control the other
# top-level members. We promote the first listed name to be 'canonical' and
# also make up a name for syntaxes (from the description). Thus we always
# have a unique name. This avoids a lot of checking in the access routines.
#
# ->{oid}->{$oid}->{
#			name	=> $canonical_name, (created for syn)
#			aliases	=> list of non. canon names
#			type	=> at/oc/syn
#			desc	=> description
#			must	=> list of can. names of mand. atts [if OC]
#			may	=> list of can. names of opt. atts [if OC]
#			syntax	=> can. name of syntax [if AT]
#			... etc per oid details
#
# These next items are optimisations, to avoid always searching the OID
# lists. Could be removed in theory. Each is a hash ref mapping
# lowercase names to the hash stored in the oid struucture
#
# ->{at}
# ->{oc}
# ->{syn}
# ->{mr}
# ->{mru}
# ->{dts}
# ->{dtc}
# ->{nfm}
#

#
# These items have no following arguments
#
my %flags = map { ($_,1) } qw(
			      single-value
			      obsolete
			      collective
			      no-user-modification
			      abstract
			      structural
			      auxiliary
			     );

#
# These items can have lists arguments
# (name can too, but we treat it special)
#
my %listops = map { ($_,1) } qw(must may sup);

#
# Map schema attribute names to internal names
#
my %type2attr = qw(
	at	attributetypes
	oc	objectclasses
	syn	ldapsyntaxes
	mr	matchingrules
	mru	matchingruleuse
	dts	ditstructurerules
	dtc	ditcontentrules
	nfm	nameforms
);

#
# Return ref to hash containing schema data - undef on failure
#

sub _parse_schema {
  my $schema = shift;
  my $entry = shift;

  return undef unless defined($entry);

  keys %type2attr; # reset iterator
  while(my($type,$attr) = each %type2attr) {
    my $vals = $entry->get_value($attr, asref => 1);

    my %names;
    $schema->{$type} = \%names;		# Save reference to hash of names => element

    next unless $vals;			# Just leave empty ref if nothing

    foreach my $val (@$vals) {
      #
      # The following statement takes care of defined attributes
      # that have no data associated with them.
      #
      next if $val eq '';

      #
      # We assume that each value can be turned into an OID, a canonical
      # name and a 'schema_entry' which is a hash ref containing the items
      # present in the value.
      #
      my %schema_entry = ( type => $type, aliases => [] );

      my @tokens;
      pos($val) = 0;

      push @tokens, $+
        while $val =~ /\G\s*(?:
                       ([()])
                      |
                       ([^"'\s()]+)
                      |
                       "([^"]*)"
                      |
                       '((?:[^']+|'[^\s)])*)'
                      )\s*/xcg;
      die "Cannot parse [$val] [",substr($val,pos($val)),"]" unless @tokens and pos($val) == length($val);

      # remove () from start/end
      shift @tokens if $tokens[0]  eq '(';
      pop   @tokens if $tokens[-1] eq ')';

      # The first token is the OID
      my $oid = $schema_entry{oid} = shift @tokens;

      while(@tokens) {
	my $tag = lc shift @tokens;

	if (exists $flags{$tag}) {
	  $schema_entry{$tag} = 1;
	}
	elsif (@tokens) {
	  if (($schema_entry{$tag} = shift @tokens) eq '(') {
	    my @arr;
	    $schema_entry{$tag} = [EMAIL PROTECTED];
	    while(1) {
	      my $tmp = shift @tokens;
	      last if $tmp eq ')';
	      push @arr,$tmp unless $tmp eq '$';

              # Drop of end of list ?
	      die "Cannot parse [$val] {$tag}" unless @tokens;
	    }
	  }

          # Ensure items that can be lists are stored as array refs
	  $schema_entry{$tag} = [ $schema_entry{$tag} ]
	    if exists $listops{$tag} and !ref $schema_entry{$tag};
	}
        else {
          die "Cannot parse [$val] {$tag}";
        }
      }

      #
      # Extract the maximum length of a syntax
      #
      $schema_entry{max_length} = $1
	if exists $schema_entry{syntax} and $schema_entry{syntax} =~ s/{(\d+)}//;

      #
      # Force a name if we don't have one
      #
      $schema_entry{name} = $schema_entry{oid}
	unless exists $schema_entry{name};

      #
      # If we have multiple names, make the name be the first and demote the rest to aliases
      #
      if (ref $schema_entry{name}) {
	my $aliases;
	$schema_entry{name} = shift @{$aliases = $schema_entry{name}};
	$schema_entry{aliases} = $aliases if @$aliases;
      }

      #
      # Store the elements by OID
      #
      $schema->{oid}->{$oid} = \%schema_entry;

      #
      # We also index elements by name within each type
      #
      foreach my $name ( @{$schema_entry{aliases}}, $schema_entry{name} ) {
	my $lc_name = lc $name;
	$names{lc $name} =  \%schema_entry;
      }
    }
  }

  $schema->{entry} = $entry;
  return $schema;
}




#
# Get the syntax of an attribute
#
sub attribute_syntax {
  my $self = shift;
  my $attr = shift;
  my $syntax;

  while ($attr) {
    my $elem = $self->attribute( $attr ) or return undef;

    $syntax = $elem->{syntax} and return $self->syntax($syntax);

    $attr = ${$elem->{sup} || []}[0];
  }

  return undef
}


sub error {
  $_[0]->{error};
}

#
# Return base entry
#
sub entry {
  $_[0]->{entry};
}

1;

Re: parse does not return false when it should

Reply via email to