On Saturday 12 October 2002 01:07, Brian Ingerson wrote:
> At least on my machine, the parsing isn't the real culprit. I'm eager to
> see Nadim's results. ;)

Hi all,

I finally got tired of this parsing nonsense. Why should I parse hundreds of KB of code to find the one function I know will be there, since I put it there in the first place? ParseRecDescent and ParseRegExp are fine as long as the code is unknown or not too difficult to parse (ParseRegExp fails on flex-generated code; you can't blame it for that ;-)).

So here is a build with ParseRecDescent:

[nadim@khemir Inline]$ perl inline_flex.pl
Time for Build Prepocess Stage: 0.0142 secs
Called parser from Inline::C /usr/local/lib/perl5/site_perl/5.8.0/Inline/C.pm 279
Time for Build Parse Stage: 94.2852 secs
Time for Build Glue 1 Stage: 0.0055 secs
Time for Build Glue 2 Stage: 0.0006 secs
Time for Build Glue 3 Stage: 0.0012 secs
Time for "perl Makefile.PL" Stage: 0.4824 secs
Time for "make" Stage: 3.3978 secs
Time for "make install" Stage: 0.2574 secs
Time for Cleaning Up Stage: 0.0001 secs
Time for Build Compile Stage: 4.1486 secs
Total Build Time: 98.4598 secs

You might understand my frustration when there is only one function to extract from the code (the code is about 100 KB in this case, but in the 500 KB case I went to bed and it was still parsing when I woke up). Developing anything with this kind of parse time was close to impossible. Using ParseRegExp didn't work at all.

To eliminate parsing I wrote a ... parser. Enter ParseManual, an ugly little hack that does the job. If your parsing time is way too long, this might help. ParseManual is not even pre-beta; it's a plain hack, and it has not been tested with anything beyond the example function. If it fails to do something you think it should, let me know.

With ParseManual you must declare the function you want exposed. In my case (it works with whatever comment type you have):

%{
// ParseManual: int yylex(void)
%}

Result:

[nadim@khemir Inline]$ perl inline_flex.pl
Time for Build Prepocess Stage: 0.0146 secs
Called parser from Inline::C /usr/local/lib/perl5/site_perl/5.8.0/Inline/C.pm 279
Time for Build Parse Stage: 0.0023 secs
Time for Build Glue 1 Stage: 0.0073 secs
Time for Build Glue 2 Stage: 0.0006 secs
Time for Build Glue 3 Stage: 0.0012 secs
Time for "perl Makefile.PL" Stage: 0.4732 secs
Time for "make" Stage: 3.3018 secs
Time for "make install" Stage: 0.2579 secs
Time for Cleaning Up Stage: 0.0001 secs
Time for Build Compile Stage: 4.0381 secs
Total Build Time: 4.0687 secs

cheers,
nadim.

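For readers who want to see what the declaration comment above resolves to, here is a minimal standalone sketch. It does not use Inline at all; `split_declaration` is a hypothetical helper, and its pattern is only modeled on the one in the attached module. It scans a piece of source for `ParseManual:` lines and splits each into return type, function name and raw argument list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Inline): split one "ParseManual:" line
# into its return type, function name and raw argument list.
sub split_declaration {
    my ($line) = @_;
    return unless $line =~ /ParseManual:\s*(.*)\s([a-zA-Z_][a-zA-Z0-9_]*)\s*\((.*)\)/;
    return { return_type => $1, function => $2, args => $3 };
}

# The flex fragment from the post; any comment style works because only
# the text from the marker to the end of the line is examined.
my $source = <<'END_C';
%{
// ParseManual: int yylex(void)
%}
END_C

my @found;
while ($source =~ /(ParseManual:.*)$/mg) {
    my $text = $1;
    my $decl = split_declaration($text)
        or die "Invalid declaration: $text\n";
    push @found, $decl;
}

# Print one line per declaration that was picked up.
printf "%s %s(%s)\n", @{$_}{qw(return_type function args)} for @found;
```

Only the marked lines are ever inspected, which is presumably why the parse stage no longer scales with the size of the generated code.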
# Nadim Khemir ([EMAIL PROTECTED])
# shamelessly hacked on ParseRegExp from Inline 0.44 Trial 4

# This parser looks for the string 'ParseManual:' followed by the definition of the
# function it should extract
# ex: // ParseManual: void Function(int * arg, int arg_2)

package Inline::C::ParseManual ;

use strict ;
use Carp ;

#--------------------------------------------------------------------------------------

sub register
{
    return
        (
            {
              extends   => ['C']
            , overrides => ['get_parser']
            }
        ) ;
}

#--------------------------------------------------------------------------------------

sub get_parser
{
    bless {}, 'Inline::C::ParseManual' ;
}

#--------------------------------------------------------------------------------------

sub code
{
    my($self,$code) = @_ ;

    # Only lines carrying the 'ParseManual:' marker are looked at.
    while($code =~ /(ParseManual:.*)$/mg)
    {
        my $current_declaration = $1 ;

        # Identifiers use '*' rather than '+' so that one-character names
        # (e.g. 'f' or 'x') are accepted too.
        if($current_declaration =~ /ParseManual:\s*(.*)\s([a-zA-Z_][a-zA-Z0-9_]*)\s*\((.*)\)/)
        {
            my($return_type, $function, $args) = ($1,$2,$3);
            #print " found ($return_type, $function, $args)\n" ;

            $return_type = NormalizeType($return_type);

            my($arg_index, @arg_names, @arg_types) = (0) ;

            foreach my $arg (split ',', $args)
            {
                if(my($type, $name) = $arg =~ /(.*)\s+([a-zA-Z_][a-zA-Z0-9_]*)?/)
                {
                    my $type = NormalizeType($type) ;

                    die "$current_declaration, '$type' is not a valid type\n"
                        unless $self->{data}{typeconv}{valid_types}{$type} ;

                    # Unnamed arguments get a generated name: arg0, arg1, ...
                    unless(defined $name)
                    {
                        $name = "arg$arg_index" ;
                        $arg_index++ ;
                    }

                    $name =~ s/\s+/ /g ;
                    $name =~ s/^\s// ;
                    $name =~ s/\s$// ;

                    push @arg_types, $type ;
                    push @arg_names, $name ;
                }
                elsif($arg =~ /^\s*void\s*$/)
                {
                    # 'void' argument list: nothing to record.
                }
                elsif($arg =~ /^\s*\.\.\.\s*$/)
                {
                    push @arg_names, '...' ;
                    push @arg_types, '...' ;
                }
                else
                {
                    die "ParseManual doesn't understand: $arg\n" ;
                }
            }

            push @{$self->{data}{functions}}, $function ;

            $self->{data}{function}{$function}{return_type}= $return_type ;
            $self->{data}{function}{$function}{arg_names}  = [@arg_names] ;
            $self->{data}{function}{$function}{arg_types}  = [@arg_types] ;
            $self->{data}{done}{$function} = 1 ;
        }
        else
        {
            die "ParseManual Invalid: $current_declaration\n" ;
        }
    }

    return 1 ;
}

#--------------------------------------------------------------------------------------

sub NormalizeType
{
    # Normalize a type for lookup in a typemap.
    my($type) = @_;

    # Remove "extern".
    # But keep "static", "inline", "typedef", etc,
    # to cause desirable typemap misses.
    $type =~ s/\bextern\b//g;

    # Whitespace: only single spaces, none leading or trailing.
    $type =~ s/\s+/ /g;
    $type =~ s/^\s//;
    $type =~ s/\s$//;

    # Adjacent "derivative characters" are not separated by whitespace,
    # but _are_ separated from the adjoining text.
    # [ Is really only * (and not ()[]) needed??? ]
    $type =~ s/\*\s\*/\*\*/g;
    $type =~ s/(?<=[^ \*])\*/ \*/g;

    return $type;
}

#--------------------------------------------------------------------------------------

1;
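
The module checks each normalized type string against `valid_types`, so it can help to see what `NormalizeType` does to a few spellings. The sketch below copies the routine out of the module so it runs on its own, without Inline installed; the sample inputs and the `@cases` table are illustrative assumptions, not taken from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy of NormalizeType from the module above, reproduced so this
# sketch is self-contained.
sub NormalizeType {
    my ($type) = @_;
    $type =~ s/\bextern\b//g;          # drop "extern"
    $type =~ s/\s+/ /g;                # collapse whitespace runs
    $type =~ s/^\s//;                  # trim leading space
    $type =~ s/\s$//;                  # trim trailing space
    $type =~ s/\*\s\*/\*\*/g;          # "* *" becomes "**"
    $type =~ s/(?<=[^ \*])\*/ \*/g;    # "char*" becomes "char *"
    return $type;
}

# Illustrative inputs paired with the normalized form each should yield.
my @cases = (
    [ 'extern  int',     'int'           ],
    [ 'char*',           'char *'        ],
    [ 'char * *',        'char **'       ],
    [ 'unsigned   long', 'unsigned long' ],
);

for my $case (@cases) {
    my ($in, $want) = @$case;
    my $got = NormalizeType($in);
    printf "%-20s -> %-16s %s\n",
        "'$in'", "'$got'", $got eq $want ? 'ok' : "expected '$want'";
}
```

If a declaration dies with "is not a valid type", running the offending type through this routine shows the exact string that was looked up.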