Package: devscripts
Version: 2.10.30
Severity: wishlist
Tags: patch
User: [EMAIL PROTECTED]
Usertags: uscan

Take a typical sf.net watch file line as an example:

http://sf.net/kcheckgmail/kcheckgmail-(.*)\.tar\.gz

If the redirector sends the request to, say, heanet, the regex used to match
the hrefs is still:

(?:(?:http://qa.debian.org)?\/watch\/sf\.php\/kcheckgmail\/)?kcheckgmail-(.*)\.tar\.gz

It would be nice if uscan also tried to match against whatever $response->base
returns, so that matching still works when the redirector sends the request
to, say, kent.
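
To make the idea concrete, here is a minimal standalone Perl sketch of the
matching, separate from the attached patch. The helper names patterns_for and
match_version are made up for illustration and are not part of uscan; the
patterns and URLs are the ones from the ffmpeg-php run shown in this report.
Unlike the patch, the sketch also passes the scheme-and-host part through
quotemeta.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of uscan): return the original pattern plus,
# when the response base URI is known, a second pattern built from it.
sub patterns_for {
    my ($pattern, $filepattern, $response_uri) = @_;
    my @patterns = ($pattern);
    if (defined $response_uri and $response_uri =~ m{^(\w+://[^/]+)(/.*)$}) {
        my ($site, $dir) = ($1, $2);
        # Site and directory are both optional prefixes of the file pattern.
        push @patterns,
            "(?:(?:" . quotemeta($site) . ")?" . quotemeta($dir) . ")?$filepattern";
    }
    return @patterns;
}

# Hypothetical helper: first capture group of the first pattern that matches
# the whole href, or undef if no pattern matches.
sub match_version {
    my ($href, @patterns) = @_;
    foreach my $p (@patterns) {
        return $1 if $href =~ m{^(?:$p)$};
    }
    return;
}

my $filepattern = 'ffmpeg-php-(.*)\.tbz2';
my $pattern = '(?:(?:http://qa.debian.org)?\/watch\/sf\.php\/ffmpeg\-php\/)?'
    . $filepattern;
my $base = 'http://www.mirrorservice.org/sites/download.sourceforge.net'
    . '/pub/sourceforge/f/ff/ffmpeg-php/';
my $href = '/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/'
    . 'ffmpeg-php-0.5.3.1.tbz2';

my @patterns = patterns_for($pattern, $filepattern, $base);
my $version = match_version($href, @patterns);
print(defined($version) ? "matched version $version\n" : "no match\n");
```

The original pattern alone cannot match the mirror's absolute href, because
the href starts with the mirror's directory rather than with the file name;
the pattern derived from the base URI accepts that directory as an optional
prefix.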

This would be very useful for the redirector, and for other sites that not
only redirect to another host but also change the directory structure.

Attached is a patch adding this feature, together with an HTTP request header
called X-uscan-features, currently set to 'enhaced-matching'. The header lets
the sf.net redirector (at least) spread the load across more mirrors when it
is present on the request, without breaking watch files that are checked with
an old uscan.
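
For illustration only, this is roughly how a redirector could test for the
header value. It is a sketch in plain Perl under my own assumptions: the
helper name has_feature is hypothetical, and the only thing taken from the
patch is that features are separated by commas and that the current value is
spelled 'enhaced-matching'.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $feature appears in the comma-separated
# X-uscan-features value. An undefined value means the header was absent,
# i.e. the request came from an old uscan.
sub has_feature {
    my ($header_value, $feature) = @_;
    return 0 unless defined $header_value;
    foreach my $f (split /,/, $header_value) {
        $f =~ s/^\s+|\s+$//g;    # tolerate spaces around the commas
        return 1 if $f eq $feature;
    }
    return 0;
}

# Only pick from the wider mirror list when the client announces the feature.
my $wide = has_feature('enhaced-matching', 'enhaced-matching');
print $wide ? "may use any mirror\n" : "stick to compatible mirrors\n";
```

Comparing whole tokens rather than substrings keeps a future feature name
that merely contains 'enhaced-matching' from being mistaken for it.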

Example: 
$ uscan.pl --report --verbose --debug
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   http://sf.net/ffmpeg-php/ffmpeg-php-(.*)\.tbz2
uscan.pl debug: requesting URL http://qa.debian.org/watch/sf.php/ffmpeg-php/
uscan.pl debug: base URI: http://www.mirrorservice.org/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/
uscan.pl debug: received content:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML LANG="en-GB">
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<TITLE>UK Mirror Service: sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php</TITLE>
<LINK REL="STYLESHEET" HREF="/styles/main.css">
</HEAD>
<BODY BGCOLOR="#ffffff" TEXT="#000000" LINK="#0000ff" VLINK="#800080" ALINK="#ff0000">
<FORM METHOD=GET ACTION="http://search.mirrorservice.org/search/" STYLE="padding:0; margin:0">
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
 <TR>
[...]
<TR>
<TD><A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.2.1.tbz2"><IMG ALT="[binary]" SRC="/nams/icons/br/binary.gif" BORDER=0 ALIGN=left VSPACE=0 HSPACE=0></A>&nbsp;<A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.2.1.tbz2">ffmpeg-php-0.5.2.1.tbz2</A></TD>
<TD ALIGN=right>&nbsp;1.5M&nbsp;&nbsp;06-Apr-2008&nbsp;</TD>
<TD></TD>
</TR>
<TR>
<TD><A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.1.tbz2"><IMG ALT="[binary]" SRC="/nams/icons/br/binary.gif" BORDER=0 ALIGN=left VSPACE=0 HSPACE=0></A>&nbsp;<A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.1.tbz2">ffmpeg-php-0.5.3.1.tbz2</A></TD>
<TD ALIGN=right>&nbsp;288.5K&nbsp;&nbsp;10-Jun-2008&nbsp;</TD>
<TD></TD>
</TR>
<TR>
<TD><A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.tbz2"><IMG ALT="[binary]" SRC="/nams/icons/br/binary.gif" BORDER=0 ALIGN=left VSPACE=0 HSPACE=0></A>&nbsp;<A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.tbz2">ffmpeg-php-0.5.3.tbz2</A></TD>
<TD ALIGN=right>&nbsp;1.5M&nbsp;&nbsp;31-May-2008&nbsp;</TD>
<TD></TD>
[...]
</TR>
</TABLE>
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0><TR>
  <TD WIDTH=3 HEIGHT=8><IMG SRC="/nams/icons/blank.gif" WIDTH=3></TD>
  <TD WIDTH=483 COLSPAN=3><IMG SRC="/nams/icons/lha_top.gif" WIDTH=483 HEIGHT=8></TD>
</TR></TABLE>
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0><TR>
  <TD WIDTH=3><IMG SRC="/nams/icons/blank.gif" ALT="" WIDTH=3></TD>
  <TD WIDTH=2 BGCOLOR="#000000"><IMG SRC="/nams/icons/blank.gif" ALT="" WIDTH=2></TD>
  <TD WIDTH=6><IMG SRC="/nams/icons/blank.gif" ALT="" WIDTH=3></TD>
  <TD VALIGN=top><FONT FACE="Arial, arial, sans serif, Sans Serif, sans-serif" SIZE="2">
<A HREF="/sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/[confdisp]">Customise display options</A><BR>
Select the name of a file to download it.
<BR>
<TABLE BORDER=0 CELLPADDING=2 CELLSPACING=0>
 <TR>
  <TD VALIGN=top>
<FONT FACE="Arial, arial, sans serif, Sans Serif, sans-serif" SIZE="2">
<IMG SRC="/nams/icons/br/squishdir.gif" ALT="[archive]"> - an archive of other files.<BR CLEAR=all>
   </FONT>
  </TD>
  <TD VALIGN=top>
<FONT FACE="Arial, arial, sans serif, Sans Serif, sans-serif" SIZE="2">
<IMG SRC="/nams/icons/br/peek.gif" ALT="[browse]"> - peek inside an archive.<BR CLEAR=all>
   </FONT>
  </TD>
 </TR>
</TABLE>
<A HREF="/help/browser_icons.html">Further help</A>
</FONT>
      </FONT>
     </TD>
    </TR>
    <TR>
     <TD COLSPAN=3><IMG SRC="/nams/icons/down_arrow.gif" WIDTH=8 HEIGHT=12></TD>
     <TD></TD>
    </TR>
   </TABLE>
  </TD>
 </TR>
</TABLE>
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH=500>
 <TR>
   <TD ALIGN="left"><FONT FACE="Arial, arial, sans serif, Sans Serif, sans-serif" SIZE="2"><FONT SIZE=-2>   <A HREF="/legal.txt">Terms of Use</A>
   </FONT></FONT></TD>
   <TD ALIGN="right"><FONT FACE="Arial, arial, sans serif, Sans Serif, sans-serif" SIZE="2"><FONT SIZE=-2>   Comments or Questions:
   <A HREF="/help/contact.html">contact us</A><BR>
  </FONT></FONT></TD>
 </TR>
</TABLE>
</BODY>
</HTML>
[End of received content]
uscan.pl debug: matching pattern(s) (?:(?:http://qa.debian.org)?\/watch\/sf\.php\/ffmpeg\-php\/)?ffmpeg-php-(.*)\.tbz2 (?:(?:http://www.mirrorservice.org)?\/sites\/download\.sourceforge\.net\/pub\/sourceforge\/f\/ff\/ffmpeg\-php\/)?ffmpeg-php-(.*)\.tbz2
-- Found the following matching hrefs:
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.2.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.2.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.3.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.3.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.4.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.4.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.5.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.5.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.6.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.6.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.7.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.7.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.8.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.8.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.9.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.4.9.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.0.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.0.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.2.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.2.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.1.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.tbz2
     /sites/download.sourceforge.net/pub/sourceforge/f/ff/ffmpeg-php/ffmpeg-php-0.5.3.tbz2
Newest version on remote site is 0.5.3.1, local version is 0.5.2.1
 => ffmpeg-php-0.5.3.1.tbz2 already in package directory
-- Scan finished

Kind regards,
-- 
Atomo64 - Raphael

Please avoid sending me Word, PowerPoint or Excel attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
Index: /home/raphael/Deb/devscripts/trunk/scripts/uscan.pl
===================================================================
--- /home/raphael/Deb/devscripts/trunk/scripts/uscan.pl	(revision 1515)
+++ /home/raphael/Deb/devscripts/trunk/scripts/uscan.pl	(working copy)
@@ -653,13 +653,17 @@
 
     my $origline = $line;
     my ($base, $site, $dir, $filepattern, $pattern, $lastversion, $action);
+    my (@patterns, $response_uri);
     my %options = ();
 
     my ($request, $response);
     my ($newfile, $newversion);
     my $style='new';
     my $urlbase;
+    my $headers = HTTP::Headers->new;
 
+    # Please separate the features with commas, only add them if needed
+    $headers->header('X-uscan-features' => 'enhaced-matching');
     %dehs_tags = ('package' => $pkg);
 
     if ($watch_version == 1) {
@@ -787,6 +791,8 @@
 	return 1;
     }
 
+    push @patterns, $pattern;
+
     # What is the most recent file, based on the filenames?
     # We first have to find the candidates, then we sort them using
     # Devscripts::Versort::versort
@@ -795,13 +801,30 @@
 	    die "$progname: you must have the libcrypt-ssleay-perl package installed\nto use https URLs\n";
 	}
 	print STDERR "$progname debug: requesting URL $base\n" if $debug;
-	$request = HTTP::Request->new('GET', $base);
+	$request = HTTP::Request->new('GET', $base, $headers);
 	$response = $user_agent->request($request);
 	if (! $response->is_success) {
 	    warn "$progname warning: In watchfile $watchfile, reading webpage\n  $base failed: " . $response->status_line . "\n";
 	    return 1;
 	}
 
+	$response_uri = $response->base;
+	if (! defined($response_uri)) {
+		warn "$progname warning: In watchfile $watchfile, failed to get base URI\n";
+	}
+	
+	print STDERR "$progname debug: base URI: $response_uri\n"
+		if $debug and defined($response_uri);
+
+	if (defined($response_uri)) {
+		my $base_dir = $response_uri;
+		
+		$base_dir =~ s%^\w+://[^/]+/%/%;
+		if ($response_uri =~ m%^(\w+://[^/]+)%) {
+			push @patterns, "(?:(?:$1)?" . quotemeta($base_dir) . ")?$filepattern";
+		}
+	}
+
 	my $content = $response->content;
 	print STDERR "$progname debug: received content:\n$content\[End of received content]\n"
 	    if $debug;
@@ -821,27 +844,29 @@
 	    ($urlbase = $base) =~ s%/[^/]*$%/%;
 	}
 
-	print STDERR "$progname debug: matching pattern $pattern\n" if $debug;
+	print STDERR "$progname debug: matching pattern(s) @patterns\n" if $debug;
 	my @hrefs;
 	while ($content =~ m/<\s*a\s+[^>]*href\s*=\s*([\"\'])(.*?)\1/gi) {
 	    my $href = $2;
-	    if ($href =~ m&^$pattern$&) {
-		if ($watch_version == 2) {
-		    # watch_version 2 only recognised one group; the code
-		    # below will break version 2 watchfiles with a construction
-		    # such as file-([\d\.]+(-\d+)?) (bug #327258)
-		    push @hrefs, [$1, $href];
-		} else {
-		    # need the map { ... } here to handle cases of (...)?
-		    # which may match but then return undef values
-		    my $mangled_version =
-			join(".", map { $_ if defined($_) }
-			     $href =~ m&^$pattern$&);
-		    foreach my $pat (@{$options{'uversionmangle'}}) {
-			eval "\$mangled_version =~ $pat;";
+	    foreach my $_pattern (@patterns) {
+		    if ($href =~ m&^$_pattern$&) {
+			if ($watch_version == 2) {
+			    # watch_version 2 only recognised one group; the code
+			    # below will break version 2 watchfiles with a construction
+			    # such as file-([\d\.]+(-\d+)?) (bug #327258)
+			    push @hrefs, [$1, $href];
+			} else {
+			    # need the map { ... } here to handle cases of (...)?
+			    # which may match but then return undef values
+			    my $mangled_version =
+				join(".", map { $_ if defined($_) }
+				     $href =~ m&^$_pattern$&);
+			    foreach my $pat (@{$options{'uversionmangle'}}) {
+				eval "\$mangled_version =~ $pat;";
+			    }
+			    push @hrefs, [$mangled_version, $href];
+			}
 		    }
-		    push @hrefs, [$mangled_version, $href];
-		}
 	    }
 	}
 	if (@hrefs) {
