On Mon, 2008-02-18 at 18:57 -0600, Raphael Geissert wrote:
> Adam D. Barratt wrote:
> 
> > On Tue, 2008-02-12 at 21:04 +0000, Adam D. Barratt wrote:
> > 
[...]
> > Further investigation suggests that transitioning wnpp-alert is
> > currently impractical.
> > 
> > The three QA pages parsed by the script contain a total of 482 bugs. In
> > contrast, there are 1948 open bugs against wnpp. Merely select()ing the
> > list of bugs takes significantly longer than an entire wnpp-alert
> > invocation; we'd then need to call status() for each of the bugs in
> > order to determine whether the subject started with O, RFA or RFH.
[...]
> Just make sure to select only open bugs (and probably also ignore the
> pending ones) and call status() with the whole list of bugs (eval is your
> friend in this case). Doing it that way was, as I said, fast and easy.

Nah, /perl/ is my friend in this case ;-)

It's easy, but I wouldn't say it was fast. I've attached a patch that
replaces the HTML parsing with SOAP calls (admittedly including pending
bugs; 129 out of 1945 isn't going to make a huge difference). It works,
but in my testing here it takes between five and ten times longer than the
original script. Admittedly my 'net connection's not the best, but the
tests were all run in the same environment so should be comparable.
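
For anyone who wants to poke at the subject matching without hitting the
BTS at all, here is a rough shell sketch of the same filter the patch
applies to each bug's subject. It is only an illustration: the function
name format_wnpp and the "BUGNUMBER SUBJECT" input format are made up for
the example, it assumes a sed that understands -E (GNU and BSD sed do),
and it assumes package names never contain spaces.

```shell
#!/bin/sh
# Hypothetical helper, not part of devscripts.
# Reads lines of the (assumed) form "BUGNUMBER SUBJECT" on stdin and
# prints "TYPE BUGNUMBER PACKAGE -- DESCRIPTION" for O, RFA and RFH bugs.
# Lines whose subject has any other prefix (ITP, RFP, ...) are dropped.
format_wnpp() {
    sed -nE 's/^([0-9]+) (O|RFA|RFH): ([^ ]+) -- (.*)$/\2 \1 \3 -- \4/p'
}

# Example input; the bug numbers and packages are invented.
printf '%s\n' \
    '100001 O: foo -- a small library' \
    '100002 ITP: bar -- something new' \
    '100003 RFH: baz -- needs help -- badly' | format_wnpp
```

The output has the same layout as the lines the script writes to $WNPP, so
the package name is still the third space-separated field, which is what
the later cut -f3 -d' ' relies on.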

Adam
--- scripts/wnpp-alert.sh	2008-02-19 20:07:28.000000000 +0000
+++ wnpp-alert-soap.sh	2008-02-19 20:10:58.000000000 +0000
@@ -79,21 +79,19 @@
       0 1 2 3 7 10 13 15
 fi
 
-# Here's a really sly sed script.  Rather than first grepping for
-# matching lines and then processing them, this attempts to sed
-# every line; those which succeed execute the 'p' command, those
-# which don't skip over it to the label 'd'
-wget -q -O $WNPPTMP http://www.debian.org/devel/wnpp/orphaned || \
-    { echo "wnpp-alert: wget http://www.debian.org/devel/wnpp/orphaned failed" >&2; exit 1; }
-sed -ne 's/.*<li><a href="http:\/\/bugs.debian.org\/\([0-9]*\)">\([^:<]*\)[: ]*\([^<]*\)<\/a>.*/O \1 \2 -- \3/; T d; p; : d' $WNPPTMP > $WNPP
-
-wget -q -O $WNPPTMP http://www.debian.org/devel/wnpp/rfa_bypackage || \
-    { echo "wnpp-alert: wget http://www.debian.org/devel/wnpp/rfa_bypackage"; >&2; exit 1; }
-sed -ne 's/.*<li><a href="http:\/\/bugs.debian.org\/\([0-9]*\)">\([^:<]*\)[: ]*\([^<]*\)<\/a>.*/RFA \1 \2 -- \3/; T d; p; : d' $WNPPTMP >> $WNPP
-
-wget -q -O $WNPPTMP http://www.debian.org/devel/wnpp/help_requested || \
-    { echo "wnpp-alert: wget http://www.debian.org/devel/wnpp/help_requested"; >&2; exit 1; }
-sed -ne 's/.*<li><a href="http:\/\/bugs.debian.org\/\([0-9]*\)">\([^:<]*\)[: ]*\([^<]*\)<\/a>.*/RFH \1 \2 -- \3/; T d; p; : d' $WNPPTMP >> $WNPP
+perl -e '
+use lib "/usr/share/devscripts";
+use Devscripts::Debbugs;
+
+my $bugs = Devscripts::Debbugs::select("pkg:wnpp", "status:open", "status:forwarded");
+my $status = Devscripts::Debbugs::status($bugs);
+
+foreach my $bug (@{$bugs}) {
+    if ($status->{$bug}->{"subject"} =~ /^(O|RFA|RFH): (.*?) -- (.*?)$/) {
+        print "$1 $bug $2 -- $3\n";
+    }
+}
+' > $WNPP
 
 cut -f3 -d' ' $WNPP | sort > $WNPP_PACKAGES
 
