When the Schwartzian Transform gets out of control...

2003-02-11 Thread Bennett Todd
I recently hacked up a quick-n-dirty helper for selectively
mirroring or updating RPMs. The algorithm I wanted was

 - collect the list available from the remote server;
 - collect the list installed locally;
 - for those packages available from the remote server, where the
   same-name package is installed locally:
- ignore unless it's newer than what's installed locally;
- if multiple newer versions are offered, pick the newest

Those last two really put me off until I noticed that with Perl-RPM,
the invocation use RPM qw(vercmp) provides an RPM package version
comparison routine, hooray (the exact algorithm for comparing rpm
version strings is baroque).

When I got around to writing the thing, here's what came out. Sort
of an elaborated and extended Schwartzian Transform. The naughty
bits can be found quickly by searching for map and sort. I
think the maps with big tagging regexps are particulary juicy.

I'm undecided whether this is good.

-Bennett

#!/usr/bin/perl -w
use strict;

=head1 NAME

  rpm-mirror --- compute RPM files to download from http server

=head1 SYNOPSIS

  rpm-mirror URL

=head1 DESCRIPTION

rpm-mirror compute the list of rpms available from the indicated
URL, which are interesting, according to the following criteria,
and prints that list on stdout.

It compares that list with the list of rpms currently installed on
the local system; any rpms offered that don't have an rpm of the
same name installed locally, are ignored.

Any rpms offered for which the local version is newer or equal, are
ignored.

Of the remainder, the newest (greatest version/release) offered
version's URL is printed out, in full, on stdout. Reasonable
invocations for various contexts would include

rpm -U `rpm-mirror url`

and

rpm-mirror url|xargs wget -c

=cut

die syntax: $0 URL\n unless @ARGV == 1;
my $url = shift;

use LWP::Simple;
$_ = get($url) || die $0: get($url) failed: $!\n;
# filename, pkgname, ver, rel
my @available = map { [ /^((.*)-([^-]+)-([^-]+)\.[^.]+\.rpm)$/ ] } 
/a\s+href=([^]+\.rpm)/gi;
# pkgname, ver, rel
my @installed = map { [ /^(.*)-([^-]+)-([^-]+)$/ ] } `rpm -qa`;
my %installed = map { $_-[0], $_ } @installed;

my %candidates;
use RPM qw(vercmp);
for (@available) {
my $name = $_-[1];
next unless exists($installed{$name});
next unless vercmp(@{$installed{$name}}[1,2], @{$_}[2,3])  0;
push @{$candidates{$name}}, $_;
}

$url =~ s/\/$//;

for (sort keys %candidates) {
if ($#{$candidates{$_}} == 0) {
print $url/ . $candidates{$_}-[0][0] . \n;
next;
}
my @sorted = map { $_-[0] } sort { vercmp(@{$b}[2,3], @{$a}[2,3]) } 
@{$candidates{$_}};
print $url/ . $sorted[0] . \n;
}



msg02919/pgp0.pgp
Description: PGP signature


Re: Converting a textfile-like string to an array and back

2003-02-11 Thread Ian Phillipps
On Tue, 11 Feb 2003 at 10:39:53 +1100, Andrew Savige wrote:

 In case anyone is unaware of (-ugene's status in world golf, you can
 find out by typing in Eugene van der Pijll at www.googlism.com.

I just typed in 'Eugene' to check.

eugene is the fastest and most pleasant
eugene is about the size of a dog
eugene is devastated

But, most relevantly,

eugene is right


Ian
--
phillipps is adequate but i wouldn't rave



Re: Converting a textfile-like string to an array and back

2003-02-11 Thread Andrew Savige
Ian Phillipps sprak:
 I just typed in 'Eugene' to check.
 
 eugene is the fastest and most pleasant
 eugene is about the size of a dog
 eugene is devastated
 
 But, most relevantly,
 
 eugene is right

In my experience, Eugene (and Ton) are always right.
I just noticed that Yitzchak's email address at www.efn.org is a
Eugene Free network. :-)

/-\


http://greetings.yahoo.com.au - Yahoo! Greetings
- Send some online love this Valentine's Day.



Re: When the Schwartzian Transform gets out of control...

2003-02-11 Thread Randal L. Schwartz
 Bennett == Bennett Todd [EMAIL PROTECTED] writes:

Bennett When I got around to writing the thing, here's what came out. Sort
Bennett of an elaborated and extended Schwartzian Transform. The naughty
Bennett bits can be found quickly by searching for map and sort. I
Bennett think the maps with big tagging regexps are particulary juicy.

Bennett I'm undecided whether this is good.

If you need only the best (or worst) of the list, a sort is probably
overkill.  Just do a high-water mark scan, keeping the best
candidate as you compare it with each other candidate.

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
[EMAIL PROTECTED] URL:http://www.stonehenge.com/merlyn/
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!



Re: When the Schwartzian Transform gets out of control...

2003-02-11 Thread Kripa Sundar
Randal writes:

= If you need only the best (or worst) of the list, a sort is probably
= overkill.  Just do a high-water mark scan, keeping the best
= candidate as you compare it with each other candidate.

For fixed (but arbitrarily large) K, it is possible to find
the K smallest or K largest elements of an N-element list in
O(N) time.  Cormen/Leiserson/Rivest's Algorithms book had the
solution to this task.  I can look up the page numbers if
someone cares.

As always, though, it is a tradeoff between programmer and machine
efficincies.

peace,  || Barefoot, female, solar engineers:
--{kr.pA}   || http://tinyurl.com/4ra8
-- 
ledgerdemain, n.: Sleight of hand in preparing the balance sheet.



Re: When the Schwartzian Transform gets out of control...

2003-02-11 Thread A. Pagaltzis
Sorting 1000 items involves a _lot_ more than 1000
comparisons - which is really all you need.  If you insist
on your approach you should memoize vercmp() or use the
orcish maneuver in your sort() callback. (Memoizing will
accelerate it across all of your script, but the orcish
maneuvre will accelerate your callback more because it
avoids unnecessary function calls. Using both at once is
not likely to be any worthwhile optimization over simply
memoizing due to double bookkeeping though.)

But why build a list with all available versions and sorting
it, when you can discard less recent new packages right as
you build it?

You're also wasting effort in other places.

 # pkgname, ver, rel
 my @installed = map { [ /^(.*)-([^-]+)-([^-]+)$/ ] } `rpm -qa`;
 my %installed = map { $_-[0], $_ } @installed;

my %installed = map { /^(.*)-([^-]+)-([^-]+)$/ ? ($1, [$2,$3]) : ()} `rpm -qa`;

 my @available = map { [ /^((.*)-([^-]+)-([^-]+)\.[^.]+\.rpm)$/ ] } 
/a\s+href=([^]+\.rpm)/gi;

my %available;
for(/a\s+href=([^]+\.rpm)/gi) {
next unless
/^((.*)-([^-]+)-([^-]+)\.[^.]+\.rpm)$/
and exists $installed{$1}
and -1 == vercmp($2,$3, @{$installed{$_}});

$available{$1} = [$0, $2, $3]
if not exists $available{$1}
or -1 == vercmp($2,$3, @{$available{$1}}[1,2]);
}

# now printing results is trivial

print map $url/.$available{$_}[0].\n, sort keys %available;

-- 
Regards,
Aristotle



Re: When the Schwartzian Transform gets out of control...

2003-02-11 Thread A. Pagaltzis
* A. Pagaltzis [EMAIL PROTECTED] [2003-02-12 07:35]:
   and -1 == vercmp($2,$3, @{$installed{$_}});

That is, of course,

and -1 == vercmp($2,$3, @{$installed{$1}});

-- 
Regards,
Aristotle



RE: Converting a textfile-like string to an array and back

2003-02-11 Thread Winter Christian
Andrew Savige [mailto:[EMAIL PROTECTED]] wrote:
 Winter Christian wrote:
  # String to array:
  @lines=$x=~/[^\n]/g;
 
 This one splits into array of chars not array of lines.

Shame on me, seems like I was unable to
read what the question was.

-Christian