When the Schwartzian Transform gets out of control...
I recently hacked up a quick-n-dirty helper for selectively mirroring or updating RPMs. The algorithm I wanted was - collect the list available from the remote server; - collect the list installed locally; - for those packages available from the remote server, where the same-name package is installed locally: - ignore unless it's newer than what's installed locally; - if multiple newer versions are offered, pick the newest Those last two really put me off until I noticed that with Perl-RPM, the invocation use RPM qw(vercmp) provides an RPM package version comparison routine, hooray (the exact algorithm for comparing rpm version strings is baroque). When I got around to writing the thing, here's what came out. Sort of an elaborated and extended Schwartzian Transform. The naughty bits can be found quickly by searching for map and sort. I think the maps with big tagging regexps are particulary juicy. I'm undecided whether this is good. -Bennett #!/usr/bin/perl -w use strict; =head1 NAME rpm-mirror --- compute RPM files to download from http server =head1 SYNOPSIS rpm-mirror URL =head1 DESCRIPTION rpm-mirror compute the list of rpms available from the indicated URL, which are interesting, according to the following criteria, and prints that list on stdout. It compares that list with the list of rpms currently installed on the local system; any rpms offered that don't have an rpm of the same name installed locally, are ignored. Any rpms offered for which the local version is newer or equal, are ignored. Of the remainder, the newest (greatest version/release) offered version's URL is printed out, in full, on stdout. Reasonable invocations for various contexts would include rpm -U `rpm-mirror url` and rpm-mirror url|xargs wget -c =cut die syntax: $0 URL\n unless @ARGV == 1; my $url = shift; use LWP::Simple; $_ = get($url) || die $0: get($url) failed: $!\n; # filename, pkgname, ver, rel my @available = map { [ /^((.*)-([^-]+)-([^-]+)\.[^.]+\.rpm)$/ ] } /a\s+href=([^]+\.rpm)/gi; # pkgname, ver, rel my @installed = map { [ /^(.*)-([^-]+)-([^-]+)$/ ] } `rpm -qa`; my %installed = map { $_-[0], $_ } @installed; my %candidates; use RPM qw(vercmp); for (@available) { my $name = $_-[1]; next unless exists($installed{$name}); next unless vercmp(@{$installed{$name}}[1,2], @{$_}[2,3]) 0; push @{$candidates{$name}}, $_; } $url =~ s/\/$//; for (sort keys %candidates) { if ($#{$candidates{$_}} == 0) { print $url/ . $candidates{$_}-[0][0] . \n; next; } my @sorted = map { $_-[0] } sort { vercmp(@{$b}[2,3], @{$a}[2,3]) } @{$candidates{$_}}; print $url/ . $sorted[0] . \n; } msg02919/pgp0.pgp Description: PGP signature
Re: Converting a textfile-like string to an array and back
On Tue, 11 Feb 2003 at 10:39:53 +1100, Andrew Savige wrote: In case anyone is unaware of (-ugene's status in world golf, you can find out by typing in Eugene van der Pijll at www.googlism.com. I just typed in 'Eugene' to check. eugene is the fastest and most pleasant eugene is about the size of a dog eugene is devastated But, most relevantly, eugene is right Ian -- phillipps is adequate but i wouldn't rave
Re: Converting a textfile-like string to an array and back
Ian Phillipps sprak: I just typed in 'Eugene' to check. eugene is the fastest and most pleasant eugene is about the size of a dog eugene is devastated But, most relevantly, eugene is right In my experience, Eugene (and Ton) are always right. I just noticed that Yitzchak's email address at www.efn.org is a Eugene Free network. :-) /-\ http://greetings.yahoo.com.au - Yahoo! Greetings - Send some online love this Valentine's Day.
Re: When the Schwartzian Transform gets out of control...
Bennett == Bennett Todd [EMAIL PROTECTED] writes: Bennett When I got around to writing the thing, here's what came out. Sort Bennett of an elaborated and extended Schwartzian Transform. The naughty Bennett bits can be found quickly by searching for map and sort. I Bennett think the maps with big tagging regexps are particulary juicy. Bennett I'm undecided whether this is good. If you need only the best (or worst) of the list, a sort is probably overkill. Just do a high-water mark scan, keeping the best candidate as you compare it with each other candidate. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 [EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/ Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Re: When the Schwartzian Transform gets out of control...
Randal writes: = If you need only the best (or worst) of the list, a sort is probably = overkill. Just do a high-water mark scan, keeping the best = candidate as you compare it with each other candidate. For fixed (but arbitrarily large) K, it is possible to find the K smallest or K largest elements of an N-element list in O(N) time. Cormen/Leiserson/Rivest's Algorithms book had the solution to this task. I can look up the page numbers if someone cares. As always, though, it is a tradeoff between programmer and machine efficincies. peace, || Barefoot, female, solar engineers: --{kr.pA} || http://tinyurl.com/4ra8 -- ledgerdemain, n.: Sleight of hand in preparing the balance sheet.
Re: When the Schwartzian Transform gets out of control...
Sorting 1000 items involves a _lot_ more than 1000 comparisons - which is really all you need. If you insist on your approach you should memoize vercmp() or use the orcish maneuver in your sort() callback. (Memoizing will accelerate it across all of your script, but the orcish maneuvre will accelerate your callback more because it avoids unnecessary function calls. Using both at once is not likely to be any worthwhile optimization over simply memoizing due to double bookkeeping though.) But why build a list with all available versions and sorting it, when you can discard less recent new packages right as you build it? You're also wasting effort in other places. # pkgname, ver, rel my @installed = map { [ /^(.*)-([^-]+)-([^-]+)$/ ] } `rpm -qa`; my %installed = map { $_-[0], $_ } @installed; my %installed = map { /^(.*)-([^-]+)-([^-]+)$/ ? ($1, [$2,$3]) : ()} `rpm -qa`; my @available = map { [ /^((.*)-([^-]+)-([^-]+)\.[^.]+\.rpm)$/ ] } /a\s+href=([^]+\.rpm)/gi; my %available; for(/a\s+href=([^]+\.rpm)/gi) { next unless /^((.*)-([^-]+)-([^-]+)\.[^.]+\.rpm)$/ and exists $installed{$1} and -1 == vercmp($2,$3, @{$installed{$_}}); $available{$1} = [$0, $2, $3] if not exists $available{$1} or -1 == vercmp($2,$3, @{$available{$1}}[1,2]); } # now printing results is trivial print map $url/.$available{$_}[0].\n, sort keys %available; -- Regards, Aristotle
Re: When the Schwartzian Transform gets out of control...
* A. Pagaltzis [EMAIL PROTECTED] [2003-02-12 07:35]: and -1 == vercmp($2,$3, @{$installed{$_}}); That is, of course, and -1 == vercmp($2,$3, @{$installed{$1}}); -- Regards, Aristotle
RE: Converting a textfile-like string to an array and back
Andrew Savige [mailto:[EMAIL PROTECTED]] wrote: Winter Christian wrote: # String to array: @lines=$x=~/[^\n]/g; This one splits into array of chars not array of lines. Shame on me, seems like I was unable to read what the question was. -Christian