On Saturday 12 October 2002 01:07, Brian Ingerson wrote:
> At least on my machine, the parsing isn't the real culprit. I'm eager to
> see Nadim's results. ;)

Hi all, I finally got tired of this parsing nonsense. Why should I parse 
hundreds of KB of code to find the one function I know will be there, since 
I put it there in the first place?

ParseRecDescent and ParseRegExp are fine as long as the code is unknown or 
not too difficult to parse. ParseRegExp fails on flex-generated code, and 
you can't blame it for that ;-)

So here is a build with ParseRecDescent:

[nadim@khemir Inline]$ perl inline_flex.pl
Time for Build Prepocess Stage: 0.0142 secs
Called parser from Inline::C 
/usr/local/lib/perl5/site_perl/5.8.0/Inline/C.pm 279
Time for Build Parse Stage: 94.2852 secs
Time for Build Glue 1 Stage: 0.0055 secs
Time for Build Glue 2 Stage: 0.0006 secs
Time for Build Glue 3 Stage: 0.0012 secs
  Time for "perl Makefile.PL" Stage: 0.4824 secs
  Time for "make" Stage: 3.3978 secs
  Time for "make install" Stage: 0.2574 secs
  Time for Cleaning Up Stage: 0.0001 secs
Time for Build Compile Stage: 4.1486 secs
Total Build Time: 98.4598 secs

You might understand my frustration when there is only one function to 
extract from the code. The code is about 100 KB in this case; in the 
500 KB case I went to bed and it was still parsing when I woke up.

Developing anything with this kind of parse time was close to impossible.

Using ParseRegExp didn't work at all. To eliminate parsing I wrote a ...
Parser.

In comes ParseManual, an ugly little hack that does the job. If your parsing 
time is way too long, this might help. ParseManual is not even pre-beta; 
it's a plain hack, tested only with the example function. If 
it fails to do something you think it should, let me know.

With ParseManual you must declare each function you want exposed. In 
my case (it works with whatever comment type you have):
%{
// ParseManual: int yylex(void)
%}
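
To illustrate how such a declaration line gets picked apart, here is a minimal stand-alone sketch modelled on the pattern used in the module's code() sub. The helper name parse_declaration is hypothetical and exists only for this example; the identifier part of the pattern is written with `*` so that one-letter function names are also accepted, which is an assumption about the intended behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): split one declaration
# line into return type, function name and raw argument list.
sub parse_declaration {
    my ($line) = @_;

    # Same shape as the pattern in code(): everything after "ParseManual:"
    # up to the last "name(" is the return type, the rest is the arguments.
    return unless $line =~
        /ParseManual:\s*(.*)\s([a-zA-Z_][a-zA-Z0-9_]*)\s*\((.*)\)/;

    return ($1, $2, $3);
}

for my $line ('// ParseManual: int yylex(void)',
              '/* ParseManual: void Function(int * arg, int arg_2) */') {
    my ($return_type, $function, $args) = parse_declaration($line);
    print "$function: returns '$return_type', args '$args'\n";
}
```

Because the pattern only looks for the 'ParseManual:' marker, the surrounding comment syntax does not matter, which is why both comment styles above are handled the same way.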

Result:

[nadim@khemir Inline]$ perl inline_flex.pl
Time for Build Prepocess Stage: 0.0146 secs
Called parser from Inline::C 
/usr/local/lib/perl5/site_perl/5.8.0/Inline/C.pm 279
Time for Build Parse Stage: 0.0023 secs
Time for Build Glue 1 Stage: 0.0073 secs
Time for Build Glue 2 Stage: 0.0006 secs
Time for Build Glue 3 Stage: 0.0012 secs
  Time for "perl Makefile.PL" Stage: 0.4732 secs
  Time for "make" Stage: 3.3018 secs
  Time for "make install" Stage: 0.2579 secs
  Time for Cleaning Up Stage: 0.0001 secs
Time for Build Compile Stage: 4.0381 secs
Total Build Time: 4.0687 secs

Cheers, Nadim.
# Nadim Khemir ([EMAIL PROTECTED])
# shamelessly hacked on ParseRegExp from Inline 0.44 Trial 4

# This parser looks for the string 'ParseManual:' followed by the definition of the 
# function it should extract
# ex: // ParseManual: void Function(int * arg, int arg_2)

package Inline::C::ParseManual ;
use strict ;
use Carp ;

#--------------------------------------------------------------------------------------

sub register 
{
return
   (
      {
        extends   => ['C']
      , overrides => ['get_parser']
      }
   ) ;
}

#--------------------------------------------------------------------------------------

sub get_parser 
{
bless {}, 'Inline::C::ParseManual' ;
}

#--------------------------------------------------------------------------------------

sub code 
{
my($self,$code) = @_ ;
    
while($code =~ /(ParseManual:.*)$/mg)
   {
   my $current_declaration = $1 ;

   # '*' (not '+') after the first character so one-letter function names match
   if($current_declaration  =~ /ParseManual:\s*(.*)\s([a-zA-Z_][a-zA-Z0-9_]*)\s*\((.*)\)/)
      { 
      my($return_type, $function, $args) = ($1,$2,$3);
      #print " found ($return_type, $function, $args)\n" ;        

      $return_type = NormalizeType($return_type);
    
      my($arg_index, @arg_names, @arg_types) = (0) ;
      foreach my $arg (split ',', $args) 
         {
         # '*' (not '+') after the first character so one-letter argument names match
         if(my($type, $name) = $arg =~ /(.*)\s+([a-zA-Z_][a-zA-Z0-9_]*)?/)
            {
            $type = NormalizeType($type) ;

            die "$current_declaration, '$type' is not a valid type\n" 
                unless $self->{data}{typeconv}{valid_types}{$type} ;
		
            unless(defined $name)
               {
               $name = "arg$arg_index" ;
               $arg_index++ ;
               }

            $name =~ s/\s+/ /g ; $name =~ s/^\s// ; $name =~ s/\s$// ;
         
            push @arg_types, $type ;
            push @arg_names, $name ;
            }
         elsif($arg =~ /^\s*void\s*$/) 
            {
            }
         elsif($arg =~ /^\s*\.\.\.\s*$/) 
            {
            push @arg_names, '...' ;
            push @arg_types, '...' ;
            }
         else 
            {
            die "ParseManual doesn't understand: $arg\n" ;
            }
         }
      
      push @{$self->{data}{functions}}, $function ;
      $self->{data}{function}{$function}{return_type}= $return_type ; 
      $self->{data}{function}{$function}{arg_names} = [@arg_names] ;
      $self->{data}{function}{$function}{arg_types} = [@arg_types] ;
      $self->{data}{done}{$function} = 1 ;
      }
   else
      {
      die "ParseManual Invalid: $current_declaration\n" ;
      }
   }
 
return 1 ;
}

#--------------------------------------------------------------------------------------

sub NormalizeType 
{
# Normalize a type for lookup in a typemap.
my($type) = @_;

# Remove "extern".
# But keep "static", "inline", "typedef", etc,
#  to cause desirable typemap misses.
$type =~ s/\bextern\b//g;

# Whitespace: only single spaces, none leading or trailing.
$type =~ s/\s+/ /g;
$type =~ s/^\s//; $type =~ s/\s$//;

# Adjacent "derivative characters" are not separated by whitespace,
# but _are_ separated from the adjoining text.
# [ Is really only * (and not ()[]) needed??? ]
$type =~ s/\*\s\*/\*\*/g;
$type =~ s/(?<=[^ \*])\*/ \*/g;

return $type;
}

#--------------------------------------------------------------------------------------

1;
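
For readers who want to see what the type normalization does before a typemap lookup, the following is a small stand-alone sketch. It copies the substitution steps from NormalizeType into a sub with the hypothetical name normalize_type so it can be run without Inline installed; the sample inputs are made up for illustration.

```perl
use strict;
use warnings;

# Stand-alone copy of the normalization steps (hypothetical name),
# so the effect of each substitution can be tried in isolation.
sub normalize_type {
    my ($type) = @_;

    $type =~ s/\bextern\b//g;            # drop "extern"
    $type =~ s/\s+/ /g;                  # collapse whitespace runs
    $type =~ s/^\s//; $type =~ s/\s$//;  # trim one leading/trailing space
    $type =~ s/\*\s\*/\*\*/g;            # "* *" becomes "**"
    $type =~ s/(?<=[^ \*])\*/ \*/g;      # "int*" becomes "int *"

    return $type;
}

# Made-up sample inputs, one result per line.
print normalize_type($_), "\n"
    for 'extern  int*', 'char * *', ' unsigned   long ';
```

The point of the normalization is that differently spaced spellings of the same C type end up as one string, so a single typemap entry can match them all.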
