On Sat, Nov 23, 2019 at 01:25:16PM +0100, Marc Espie wrote: > On Sat, Nov 23, 2019 at 12:01:34PM +0000, Stuart Henderson wrote: > > Before: > > # spent 125s > > After: > > # spent 806ms
Thanks! I meant to look further than just tidying with no functional change, but ran out of time to actually look at the sqlports schema. > > Index: Utils.pm > > =================================================================== > > RCS file: /cvs/ports/infrastructure/lib/OpenBSD/PortGen/Utils.pm,v > > retrieving revision 1.4 > > diff -u -p -r1.4 Utils.pm > > --- Utils.pm 19 Nov 2019 22:43:31 -0000 1.4 > > +++ Utils.pm 23 Nov 2019 12:00:14 -0000 > > @@ -64,7 +64,7 @@ sub module_in_ports > > } ) or die "failed to connect to database: $DBI::errstr"; > > > > my @results = @{ $dbh->selectcol_arrayref( > > - "SELECT FULLPKGPATH FROM Ports WHERE DISTNAME LIKE ?", > > + "SELECT FULLPKGPATH FROM PortsQ WHERE DISTNAME LIKE ?", > > {}, "$module%" > > ) }; > > Woah. That code is horrible. It also spends a lot of time > connecting/disconnecting. > > - The database connection should be a singleton > - sqlports should always be opened ReadOnly > - The query should be prepared. > > Apart from that, PortsQ will help, but the basic code should be taken out > and shot. Originally portgen would only ever port a single module and so this function only ever got run once, but, now that we can hit it many times, it does make sense to prepare once and execute many times. Since we no longer select flavors we can ask SQLite to sort by length for us and directly loop over the resulting rows, which is probably a useless optimization. Thoughts? Index: lib/OpenBSD/PortGen/Utils.pm =================================================================== RCS file: /cvs/ports/infrastructure/lib/OpenBSD/PortGen/Utils.pm,v retrieving revision 1.5 diff -u -p -r1.5 Utils.pm --- lib/OpenBSD/PortGen/Utils.pm 23 Nov 2019 14:59:39 -0000 1.5 +++ lib/OpenBSD/PortGen/Utils.pm 23 Nov 2019 21:46:28 -0000 @@ -18,6 +18,7 @@ package OpenBSD::PortGen::Utils; use 5.012; use warnings; +use feature qw( state ); use parent qw( Exporter ); @@ -47,12 +48,8 @@ sub ports_dir { $ENV{PORTSDIR} || '/usr/ sub base_dir { ports_dir() . '/mystuff' } -sub module_in_ports +sub _module_list_sth { - my ( $module, $prefix ) = @_; - - return unless $module and $prefix; - my $dbfile = '/usr/local/share/sqlports'; die "install databases/sqlports and databases/p5-DBD-SQLite\n" unless -e $dbfile; @@ -64,22 +61,32 @@ sub module_in_ports RaiseError => 1, } ) or die "failed to connect to database: $DBI::errstr"; - my @results = @{ $dbh->selectcol_arrayref( - "SELECT _paths.fullpkgpath FROM _paths JOIN _paths p1 ON p1.pkgpath=_paths.id - JOIN _ports ON _ports.fullpkgpath=p1.id WHERE _ports.distname LIKE ?", - {}, "$module%" - ) }; - - $dbh->disconnect(); - - # just returning the shortest one that's a module of the same ecosystem - # this should be improved - @results = sort { length $a <=> length $b } @results; + return $dbh->prepare(q{ + SELECT _paths.fullpkgpath FROM _paths + JOIN _paths p1 ON p1.pkgpath=_paths.id + JOIN _ports ON _ports.fullpkgpath=p1.id + WHERE _ports.distname LIKE ? + ORDER BY LENGTH(_paths.fullpkgpath) + }); +} + +sub module_in_ports +{ + my ( $module, $prefix ) = @_; + return unless $module and $prefix; + + state $sth = _module_list_sth(); + END { undef $sth }; # Bus error if destroyed during global destruction + + $sth->execute("$module%"); # this works well enough in practice, but can't rely on it # see devel/perltidy - for (@results) { - return $_ if /\/$prefix/; + while ( my ($path) = $sth->fetchrow_array ) { + if ( $path =~ m{/$prefix} ) { + $sth->finish; + return $path; + } } # Many ports, in particular python ports, start with $prefix,