Bug#752384: HEAnet sourceforge mirror is outdated

2014-09-11 Thread Daniel Lintott
Hi Paul,

On 11/09/14 04:20, Paul Wise wrote:
> On Tue, Jul 22, 2014 at 5:29 PM, Daniel Lintott wrote:
>
>> I shall drop another version of the patch to the bug report that reverts
>> the custom caching mechanism.
>
> A yak shaving exercise reminded me that this hasn't been done yet,
> could you please send the updated patch?

Apologies for not getting this done sooner; I got tied up with some other
projects.

Attached is a new patch against the old sf.wml from SVN, which doesn't
include any caching mechanism.

Regards

Daniel
--- ../sf-redirect-old/sf.wml	2014-07-21 19:24:00.835216162 +0100
+++ sf.wml	2014-09-11 11:46:00.277558546 +0100
@@ -1,21 +1,10 @@
 <?php
-
-$data_dir = '/srv/qa.debian.org/data/watch';
-
 // need to strip leading slash, sf.net doesn't like double slashes
 $project=ltrim($_SERVER['PATH_INFO'], '/');
 
 if (!$project) {
-header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
-exit;
-}
-
-$fdb = $data_dir . '/sf-list.db';
-
-if (!file_exists($fdb)) {
-header('HTTP/1.0 500 Internal Server Error');
-die('The files database is not available. Please report this message to'.
-	' debian-qa@lists.debian.org');
+	header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
+	exit;
 }
 
 // $project is not a file and doesn't have trailing slash
@@ -29,40 +18,31 @@
 exit;
 }
 
-$db = dba_open($fdb, 'r', 'db4');
-
-if (!dba_exists($project, $db)) {
-header('HTTP/1.0 404 File Not Found');
-die('There is no information about the '.$project.' project.');
-}
+$xml_url = "https://sourceforge.net/projects/$project/rss";
 
-?><html>
+$xml = simplexml_load_file($xml_url, 'SimpleXMLElement', LIBXML_NOCDATA);
+$title = $xml->channel[0]->title;
+$files = $xml->channel[0]->item;
+?>
+<html>
 <head>
-<title>File listing for project <?php echo htmlspecialchars($project); ?></title>
+<title>File listing for project <?php echo $title; ?></title>
 </head>
 <body>
 <p>
-<h1>File listing for project <?php echo htmlspecialchars($project); ?></h1>
-Visit <a href="http://sf.net/projects/<?php echo htmlspecialchars($project); ?>"><?php echo
-htmlspecialchars($project); ?>'s project page</a>.<br/><br/>
+<h1>File listing for project <?php echo $title; ?></h1>
+Visit <a href="http://sf.net/projects/<?php echo $project; ?>"><?php echo $project; ?>'s project page</a>.<br><br>
 <?php
-echo dba_fetch($project, $db);
+foreach ($files as $item) {
+	$file = basename($item->title);
+	$link = $_SERVER['SCRIPT_NAME'] . "/$project/$file";
+	echo "<a href='$file'>$file</a><br>\n";
+}
 ?>
 </p>
 <p>
-Thanks to <a href="http://ftp.heanet.ie/">HEAnet's mirror service</a>
-for being the source of data for this service.
-</p>
-<p>
 Get the source code: <a href="svn://anonscm.debian.org/svn/qa/trunk/wml/watch">checkout SVN repository</a> &#124;
 <a href="http://anonscm.debian.org/viewvc/qa/trunk/wml/watch/">browse SVN repository</a>
 </p>
-<p> Last database update:
-<?php echo date(DATE_RFC822, filemtime($fdb)); ?>
-</p>
 </body>
-</html><?php
-
-dba_close($db);
-
-?>
+</html>




Bug#752384: HEAnet sourceforge mirror is outdated

2014-09-11 Thread Paul Wise
On Thu, Sep 11, 2014 at 9:53 PM, Paul Wise wrote:

> Thanks! Committed and made live:

Daniel, there is one bug I'm hoping you can help with since I've
mostly forgotten how to write PHP.

URLs like this:

https://qa.debian.org/watch/sf.php/chromium-bsu

Need to be redirected to URLs like this:

https://qa.debian.org/watch/sf.php/chromium-bsu/

Otherwise the links within the page will go to the wrong place.
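
To illustrate why the trailing slash matters, here is a minimal sketch in Python of how a relative link resolves against each form of the URL. The file name is a made-up example; only the two base URLs come from this message.

```python
from urllib.parse import urljoin

# Base URLs as a browser or uscan would see them
without_slash = "https://qa.debian.org/watch/sf.php/chromium-bsu"
with_slash = "https://qa.debian.org/watch/sf.php/chromium-bsu/"

# Hypothetical relative href, as emitted in the file listing
href = "chromium-bsu-0.9.15.tar.gz"

# Without the trailing slash the last path segment is treated as a file
# name and is replaced by the href
wrong = urljoin(without_slash, href)

# With the trailing slash the href is resolved beneath the project path
right = urljoin(with_slash, href)

print(wrong)
print(right)
```

Only the second form keeps the project name in the resolved link, which is why either a redirect or absolute links are needed.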

-- 
bye,
pabs

https://wiki.debian.org/PaulWise


-- 
To UNSUBSCRIBE, email to debian-qa-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: 
https://lists.debian.org/CAKTje6GvREw5Fer7YF4=uzYFwJ26ze5XRF+zNJA=kto3uw-...@mail.gmail.com



Bug#752384: HEAnet sourceforge mirror is outdated

2014-09-11 Thread Daniel Lintott
Hi Paul,

On 11/09/14 15:21, Paul Wise wrote:
> On Thu, Sep 11, 2014 at 9:53 PM, Paul Wise wrote:
>
>> Thanks! Committed and made live:
>
> Daniel, there is one bug I'm hoping you can help with since I've
> mostly forgotten how to write PHP.
>
> URLs like this:
>
> https://qa.debian.org/watch/sf.php/chromium-bsu
>
> Need to be redirected to URLs like this:
>
> https://qa.debian.org/watch/sf.php/chromium-bsu/
>
> Otherwise the links within the page will go to the wrong place.

I've attached a patch which should solve this problem. In fact the fix was
already in my script, just not used, as you'll see.

I've tested the patch locally and it seems to function okay for URLs
both with and without a trailing slash.

If there are any problems, let me know.

Regards

Daniel
Use the link which has been generated from the project and file 
name. This avoids complications with a URL having (or not having)
a trailing slash.

Index: sf.wml
===================================================================
--- sf.wml	(revision 3259)
+++ sf.wml	(working copy)
@@ -36,7 +36,7 @@
 foreach ($files as $item) {
 	$file = basename($item->title);
 	$link = $_SERVER['SCRIPT_NAME'] . "/$project/$file";
-	echo "<a href='$file'>$file</a><br>\n";
+	echo "<a href='$link'>$file</a><br>\n";
 }
 ?>
 </p>




Bug#752384: HEAnet sourceforge mirror is outdated

2014-09-11 Thread Paul Wise
On Fri, Sep 12, 2014 at 12:59 AM, Daniel Lintott wrote:

> I've attached a patch which should solve this problem. In fact the fix was
> already in my script, just not used, as you'll see.

Excellent, applied!

BTW: we could always use more help with QA infra, more details here :)

https://wiki.debian.org/qa.debian.org

-- 
bye,
pabs

https://wiki.debian.org/PaulWise





Bug#752384: HEAnet sourceforge mirror is outdated

2014-09-10 Thread Paul Wise
On Tue, Jul 22, 2014 at 5:29 PM, Daniel Lintott wrote:

> I would say that seems like a sensible suggestion. Using a ready-made
> system for the caching obviously has major benefits of being
> well-established and supported.
>
> Personally I've never used or set up such a service, but there are
> several available: nginx[1]; trafficserver[2]; haproxy[3]

I expect we will just go with mod_cache from apache2, seems the simplest option.

> I shall drop another version of the patch to the bug report that reverts
> the custom caching mechanism.

A yak shaving exercise reminded me that this hasn't been done yet,
could you please send the updated patch?

-- 
bye,
pabs

https://wiki.debian.org/PaulWise





Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-22 Thread Daniel Lintott
On 22/07/14 02:05, Paul Wise wrote:
> On Tue, Jul 22, 2014 at 2:50 AM, Daniel Lintott wrote:
>
>> Okay... It took a bit of thinking of how to work it, but I've come up
>> with a working solution that caches the file list for each project
>> requested.
>
> There was some discussion on IRC about the problem and a caching proxy
> was suggested instead:
>
> sf rss => rss to html converter => caching proxy => uscan
>
> Thinking about it some more, caching the output makes more sense than
> caching filenames and using a HTTP caching proxy is the usual way to
> cache HTML/websites so it might be best to just do that.
>
> Thoughts?

I would say that seems like a sensible suggestion. Using a ready-made
system for the caching obviously has major benefits of being
well-established and supported.

Personally I've never used or set up such a service, but there are
several available: nginx[1]; trafficserver[2]; haproxy[3]

[1] https://packages.debian.org/wheezy/nginx
[2] https://packages.debian.org/wheezy/trafficserver
[3] https://packages.debian.org/wheezy-backports/haproxy

I shall drop another version of the patch to the bug report that reverts
the custom caching mechanism.

Regards

Daniel






Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-21 Thread Daniel Lintott
On 21/07/14 04:11, Paul Wise wrote:
> Unfortunately the final word on that is that Debian needs to replace
> our current redirector with one based on the RSS feature and also to
> add a cache mechanism (they suggested a 1 hour cache time) so that we
> don't overload the RSS feature.
>
> Do you think it would be possible for you to add a caching mechanism
> and convert your version into a patch that we can apply to SVN?

It should definitely be possible to add a caching mechanism to the
new redirector, currently I have a couple of ideas on this but both have
drawbacks.

1. Use a Berkeley DB to store the retrieved data, similar to what is
currently done.

Something like:
| project | file | update_time |
|---------|------|-------------|
|         |      |             |

Problems I foresee:
* My intention would be to check at the time the script is requested if
the update_time > 1 hour ago...
- if yes... get the new RSS and update the DB
- if no use the information from the DB
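
The check described above can be sketched as follows. This is a minimal illustration in Python rather than the PHP used by the redirector; the function name and parameters are hypothetical, and the one hour limit is the value suggested in this thread.

```python
import time

CACHE_TIME = 3600  # seconds; the 1 hour cache time suggested in this thread


def needs_refresh(update_time, now=None, cache_time=CACHE_TIME):
    """Return True when the cached entry is older than cache_time seconds.

    update_time and now are Unix timestamps. now defaults to the current time.
    """
    if now is None:
        now = time.time()
    return (now - update_time) > cache_time


# Entry stored 2 hours ago: fetch the RSS again and update the DB
print(needs_refresh(update_time=0, now=7200))
# Entry stored 10 minutes ago: serve the information from the DB
print(needs_refresh(update_time=0, now=600))
```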

What happens if this happens for multiple requests at the same time?

2. Save the XML file to a cache folder

Then at request time check the time on that file and its age.

The only problem I can see this causing is disk space (I don't know how
much of an issue this is for Debian)

The RSS file for the VPCS project is almost 52KB. Picking a figure out
of the air (as I've no idea how many packages use the redirector) of 10,000,
this is going to create 520MB of cached files.

Obviously some projects may have a smaller RSS and others larger
which may skew that estimate.

Any comments on these would be most appreciated.

Cheers,

Daniel





Re: Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-21 Thread Stuart Prescott
Hi Daniel,

many thanks for your work on this!

> It should definitely be possible to add a caching mechanism to the
> new redirector, currently I have a couple of ideas on this but both have
> drawbacks.
>
> 1. Use a Berkeley DB to store the retrieved data, similar to what is
> currently done.

BDB may not be a particularly good choice either [1] -- there are other DBs 
suggested in that thread that seem to have a better future.

[1] https://lists.debian.org/debian-devel/2014/06/msg00328.html

> Problems I foresee:
> * My intention would be to check at the time the script is requested if
> the update_time > 1 hour ago...
> - if yes... get the new RSS and update the DB
> - if no use the information from the DB
>
> What happens if this happens for multiple requests at the same time?

If the update of the db is atomic (which is easy to arrange for most DBs), 
then I'd be tempted to ignore that you might very occasionally request the 
same RSS feed twice in quick succession. Others may disagree with me here... 
but I'd worry about that later if it is actually a problem.
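
One common way to get an atomic update for a file-backed cache is to write to a temporary file and rename it over the old one. The following is a minimal sketch in Python under that assumption; the helper name is hypothetical and this is not taken from the redirector itself.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data (bytes) to path so readers see either the old or new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is atomic on POSIX when source and target share a filesystem
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this approach two simultaneous refreshes may both fetch the feed, but a reader never sees a half-written cache file.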

An alternative to worrying about locking would be to only update the db from 
cron. This adds some latency to the scan which is annoying for the 
maintainer sometimes but not really an issue from the QA perspective. If the 
RSS feed offers a Last-Modified header for HEAD requests, then the cron job 
can be done easily and often (perhaps that should be investigated anyway?).

> 2. Save the XML file to a cache folder
>
> Then at request time check the time on that file and its age.
>
> The only problem I can see this causing is disk space (I don't know how
> much of an issue this is for Debian)

Rather than keeping all of them all the time, you could delete them as soon 
as they are older than the refresh time; that could be done with find from 
cron. If we were to split the QA cron jobs that check for outdated sources 
across the day, it would be easy to keep that number down to 10-20% of the 
total XML.
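
The pruning step could equally be a find one-liner from cron. As a rough sketch of the same idea in Python (the function name and the one hour default are assumptions for illustration):

```python
import os
import time


def prune_cache(cache_dir, max_age=3600, now=None):
    """Delete regular files in cache_dir older than max_age seconds.

    Returns the sorted names of the files that were removed.
    """
    if now is None:
        now = time.time()
    # Read the directory listing first, then delete, so the directory is
    # not modified while it is being scanned
    with os.scandir(cache_dir) as it:
        entries = list(it)
    removed = []
    for entry in entries:
        if entry.is_file() and now - entry.stat().st_mtime > max_age:
            os.unlink(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```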

> The RSS file for the VPCS project is almost 52KB. Picking a figure out
> of the air (as I've no idea how many packages use the redirector) of 10,000,
> this is going to create 520MB of cached files.

Just to help with one of the two random numbers:

udd=> select count(*) from upstream where watch_file like '%sf.net/%';
 count 
-------
  2409
(1 row)
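
As a rough worked estimate, assuming every one of those watch files maps to a feed about the size of the VPCS one (which is certainly not exact), the cache size works out as:

```python
RSS_SIZE_KB = 52        # size of the VPCS feed quoted in this thread
SF_WATCH_FILES = 2409   # count returned by the UDD query above

total_kb = RSS_SIZE_KB * SF_WATCH_FILES
total_mb = total_kb / 1000  # decimal megabytes, same convention as the 520MB estimate

print(f"{total_kb} KB is about {total_mb:.0f} MB")
```

That is roughly a quarter of the 520MB figure, before any pruning or compression.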


hope that helps!

cheers
Stuart


-- 
Stuart Prescott    http://www.nanonanonano.net/   stu...@nanonanonano.net
Debian Developer   http://www.debian.org/         stu...@debian.org
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7







Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-21 Thread Abou Al Montacir
Hi All,

On Mon, 2014-07-21 at 11:02 +0100, Daniel Lintott wrote:
> On 21/07/14 04:11, Paul Wise wrote:
>> Unfortunately the final word on that is that Debian needs to replace
>> our current redirector with one based on the RSS feature and also to
I honestly was expecting such an answer!

>> add a cache mechanism (they suggested a 1 hour cache time) so that we
>> don't overload the RSS feature.
Are we really consuming so much bandwidth for that feature? I assume
this will happen each time a user or a daemon wants to check a
particular package. I'm not convinced this is worth it, especially since
they ask for a cache of 1 hour. Do we expect more than two checks per
package per day (daily daemon + random user)?
>
>> Do you think it would be possible for you to add a caching mechanism
>> and convert your version into a patch that we can apply to SVN?
>
> It should definitely be possible to add a caching mechanism to the
> new redirector, currently I have a couple of ideas on this but both have
> drawbacks.
>
> 1. Use a Berkeley DB to store the retrieved data, similar to what is
> currently done.
>
> Something like:
> | project | file | update_time |
>
> | |  | |
>
> Problems I foresee:
> * My intention would be to check at the time the script is requested if
> the update_time > 1 hour ago...
>   - if yes... get the new RSS and update the DB
>   - if no use the information from the DB
>
> What happens if this happens for multiple requests at the same time?
>
> 2. Save the XML file to a cache folder
I like this one.

> Then at request time check the time on that file and its age.
>
> The only problem I can see this causing is disk space (I don't know how
> much of an issue this is for Debian)
>
> The RSS file for the VPCS project is almost 52KB. Picking a figure out
> of the air (as I've no idea how many packages use the redirector) of 10,000,
> this is going to create 520MB of cached files.
>
> Obviously some projects may have a smaller RSS and others larger
> which may skew that estimate.
>
> Any comments on these would be most appreciated.
What about compressing the files? This can reduce the size dramatically.
Can you please check for the file you used as an example?

Cheers,
Abou Al Montacir




Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-21 Thread Paul Wise
On Mon, 2014-07-21 at 15:39 +0200, Abou Al Montacir wrote:

> Are we really consuming so much bandwidth for that feature? I assume
> this will happen each time a user or a daemon wants to check a
> particular package. I'm not convinced this is worth it, especially since
> they ask for a cache of 1 hour. Do we expect more than two checks per
> package per day (daily daemon + random user)?

I told them the average usage based on the stats from qa.d.o Apache logs
(up to 30K requests per day) and they said that was a bit high and asked us
to implement a cache.

> What about compressing the files? This can reduce the size dramatically.
> Can you please check for the file you used as an example?

Seems pointless to store the raw RSS, best extract the filenames and
store them in a database instead.

-- 
bye,
pabs

http://wiki.debian.org/PaulWise




Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-21 Thread Daniel Lintott
Control: -1 + patch

On 21/07/14 14:58, Paul Wise wrote:
> On Mon, 2014-07-21 at 15:39 +0200, Abou Al Montacir wrote:
>
>> Are we really consuming so much bandwidth for that feature? I assume
>> this will happen each time a user or a daemon wants to check a
>> particular package. I'm not convinced this is worth it, especially since
>> they ask for a cache of 1 hour. Do we expect more than two checks per
>> package per day (daily daemon + random user)?
>
> I told them the average usage based on the stats from qa.d.o Apache logs
> (up to 30K requests per day) and they said that was a bit high and asked us
> to implement a cache.

That doesn't surprise me in the least! GetDeb actually switched to using
my test redirector and in 5 days I logged nearly 32000 hits at my
server... each of which would have been passed to sf.net (this was
quickly resolved though).

>> What about compressing the files? This can reduce the size dramatically.
>> Can you please check for the file you used as an example?
>
> Seems pointless to store the raw RSS, best extract the filenames and
> store them in a database instead.

Okay... It took a bit of thinking of how to work it, but I've come up
with a working solution that caches the file list for each project
requested.

I am storing each project's file list in a separate Berkeley DB so we
can check the file modification time and only update when the file is
older than the cache limit ($cache_time) in seconds (currently 3600
seconds).

Currently it is configured to store these files in a subdirectory of
cache ($cache_dir), which will need to be writeable by the web server.

Otherwise I don't think there is anything else particularly special to
report.

I have updated my test server and it is now running the latest version
of the script.

Regards,

Daniel
--- ../sf-redirect-old/sf.wml	2014-07-21 19:24:00.835216162 +0100
+++ sf.wml	2014-07-21 19:45:21.683113723 +0100
@@ -1,21 +1,12 @@
 <?php
-
-$data_dir = '/srv/qa.debian.org/data/watch';
-
 // need to strip leading slash, sf.net doesn't like double slashes
 $project=ltrim($_SERVER['PATH_INFO'], '/');
+$cache_dir = './cache';
+$cache_time = 3600;
 
 if (!$project) {
-header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
-exit;
-}
-
-$fdb = $data_dir . '/sf-list.db';
-
-if (!file_exists($fdb)) {
-header('HTTP/1.0 500 Internal Server Error');
-die('The files database is not available. Please report this message to'.
-	' debian-qa@lists.debian.org');
+	header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
+	exit;
 }
 
 // $project is not a file and doesn't have trailing slash
@@ -29,40 +20,60 @@
 exit;
 }
 
-$db = dba_open($fdb, 'r', 'db4');
+$db_file = "$cache_dir/$project.db";
 
-if (!dba_exists($project, $db)) {
-header('HTTP/1.0 404 File Not Found');
-die('There is no information about the '.$project.' project.');
+if (file_exists($db_file) and time() - filemtime($db_file) < $cache_time ) {
+	# Open the db_file for reading
+	$db = dba_open($db_file, 'r', 'db4');
+} else {
+$xml_url = "https://sourceforge.net/projects/$project/rss";
+	# Update/create the db_file, then read it's contents
+	# Load the rss feed using simplexml
+	$xml = @simplexml_load_file($xml_url, 'SimpleXMLElement', LIBXML_NOCDATA);
+	if ($xml === false) {
+		echo "No project named $project could be found, check the project name and try again";
+		exit;
+	} else {
+		# Get an array of files from the XML
+		$files = $xml->channel[0]->item;
+		# Create a new db file
+		$db = dba_open($db_file . '-new', 'c', 'db4');
+		# Add the file list to the db
+		$i = 0;
+		foreach ($files as $item) {
+			dba_insert($i, basename($item->title),$db);
+			$i++;
+		}
+		dba_close($db);
+		rename($db_file . '-new', $db_file);
+		$db = dba_open($db_file, 'r', 'db4');
+	}
 }
-
-?><html>
+?>
+<html>
 <head>
 <title>File listing for project <?php echo htmlspecialchars($project); ?></title>
 </head>
 <body>
 <p>
 <h1>File listing for project <?php echo htmlspecialchars($project); ?></h1>
-Visit <a href="http://sf.net/projects/<?php echo htmlspecialchars($project); ?>"><?php echo
-htmlspecialchars($project); ?>'s project page</a>.<br/><br/>
+Visit <a href="http://sf.net/projects/<?php echo htmlspecialchars($project); ?>"><?php 
+echo htmlspecialchars($project); ?>'s project page</a>.<br><br>
 <?php
-echo dba_fetch($project, $db);
+$key = dba_firstkey($db);
+while ($key !== False) {
+	$file = dba_fetch($key, $db);
+	$link = $_SERVER['SCRIPT_NAME'] . "/$project/$file";
+	echo "<a href='$link'>$file</a><br>\n";
+	$key = dba_nextkey($db);
+}
 ?>
 </p>
-<p>
-Thanks to <a href="http://ftp.heanet.ie/">HEAnet's mirror service</a>
-for being the source of data for this service.
-</p>
+<p>Last database update: <?php echo date(DATE_RFC822, filemtime($db_file)); ?></p>
 <p>
 Get the source code: <a href="svn://anonscm.debian.org/svn/qa/trunk/wml/watch">checkout SVN repository</a> &#124;
 <a href="http://anonscm.debian.org/viewvc/qa/trunk/wml/watch/">browse SVN repository</a>
 </p>
-<p> Last database update:
-<?php echo date(DATE_RFC822, filemtime($fdb)); ?>

Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-21 Thread Paul Wise
On Tue, Jul 22, 2014 at 2:50 AM, Daniel Lintott wrote:

> Okay... It took a bit of thinking of how to work it, but I've come up
> with a working solution that caches the file list for each project
> requested.

There was some discussion on IRC about the problem and a caching proxy
was suggested instead:

sf rss => rss to html converter => caching proxy => uscan

Thinking about it some more, caching the output makes more sense than
caching filenames and using a HTTP caching proxy is the usual way to
cache HTML/websites so it might be best to just do that.

Thoughts?

-- 
bye,
pabs

https://wiki.debian.org/PaulWise





Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-20 Thread Paul Wise
On Tue, Jul 8, 2014 at 5:26 PM, Daniel Lintott wrote:

> Indeed... I have managed to replicate the functionality of the current
> SF redirector using the RSS feed
...
> Ack... That would be very nice to see. Let's hope they can come up with
> a nice solution.

Unfortunately the final word on that is that Debian needs to replace
our current redirector with one based on the RSS feature and also to
add a cache mechanism (they suggested a 1 hour cache time) so that we
don't overload the RSS feature.

Do you think it would be possible for you to add a caching mechanism
and convert your version into a patch that we can apply to SVN?

-- 
bye,
pabs

https://wiki.debian.org/PaulWise





Bug#752384: HEAnet sourceforge mirror is outdated (was Re: [Pkg-pascal-devel] Lazarus watch file is broken)

2014-07-13 Thread Abou Al Montacir
Hi Daniel,

On Sun, 2014-07-13 at 11:21 +0100, Daniel Lintott wrote:
> Hi,
>
...
>
> I think you're actually following the bug at [1]. You can see the
> conversation I had with Paul in that bug report.
>
> Regards,
>
> Daniel Lintott

I have tested your tool for Lazarus and it looks to be working as expected.

I'd recommend using this solution in [2] as it looks really easy to
maintain/update with so few PHP lines. Also I don't know how much time
it will take SF.net to come up with a solution for [3].

BTW, do you expect one can rely on [1] and modify their watch file to use
it instead of [2] until this bug gets fixed?

[1] http://alpha.serverb.co.uk/debian/sf.php/lazarus
[2] https://qa.debian.org/watch/sf.php/lazarus
[3] http://sourceforge.net/p/forge/site-support/8064/




Bug#752384: HEAnet sourceforge mirror is outdated (was Re: [Pkg-pascal-devel] Lazarus watch file is broken)

2014-07-13 Thread Daniel Lintott

Hi Abou,

On 13/07/14 12:40, Abou Al Montacir wrote:
> Hi Daniel,
>
> ...
>
> I have tested your tool for Lazarus and it looks to be working as expected.

That's always good to know!

> I'd recommend using this solution in [2] as it looks really easy to
> maintain/update with so few PHP lines. Also I don't know how much time
> it will take SF.net to come up with a solution for [3].
>
> BTW, do you expect one can rely on [1] and modify their watch file to use
> it instead of [2] until this bug gets fixed?

It shouldn't be a problem to use [1], there should be enough bandwidth
to manage.

So I would say... go ahead and use it, but I only intend this to be
a temporary fix. How long it will take SF.net to come up with a solution
is anyone's guess! But if it looks like my script will be
'long-term', it would be preferable to move it to a Debian server.

> [1] http://alpha.serverb.co.uk/debian/sf.php/lazarus
> [2] https://qa.debian.org/watch/sf.php/lazarus
> [3] http://sourceforge.net/p/forge/site-support/8064/

Regards,

Daniel





Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-08 Thread Daniel Lintott

On 07/07/14 12:27, Paul Wise wrote:
> On Mon, 2014-07-07 at 12:15 +0100, Daniel Lintott wrote:
>
>> I don't know whether this has been found/investigated before but there
>> appears to be an RSS feed for each project containing the file downloads.
>>
>> So for my package VPCS, the RSS feed is at:
>> https://sourceforge.net/projects/vpcs/rss
>
> Thanks, I wasn't aware of that.

No problem ;)

>> It would seem this is a viable alternative as the links provided map to
>> the SF mirror selector, so should always return a mirror with the
>> correct files present.
>
> That isn't quite what we need for uscan but could be a good start, we
> would still need to process the RSS into HTML though.

Indeed... I have managed to replicate the functionality of the current
SF redirector using the RSS feed

Demo: http://alpha.serverb.co.uk/debian/sf.php/vpcs
Git: http://anonscm.debian.org/gitweb/?p=users/dlintott-guest/sf-rss.git

Feel free to use if you'd like (it was mainly an exercise to flex my PHP
skills again!)

> I have been in contact with the SourceForge people and they are in the
> evaluation process of creating a permanent fix for us:
>
> http://sourceforge.net/p/forge/site-support/8064/

Ack... That would be very nice to see. Let's hope they can come up with
a nice solution.





Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-07 Thread Daniel Lintott
Hi Paul,

I just hit this same problem with one of my packages that is hosted on
SF.net and the HEAnet mirror doesn't hold the latest source.

I don't know whether this has been found/investigated before but there
appears to be an RSS feed for each project containing the file downloads.

So for my package VPCS, the RSS feed is at:
https://sourceforge.net/projects/vpcs/rss

It would seem this is a viable alternative as the links provided map to
the SF mirror selector, so should always return a mirror with the
correct files present.

Regards

Daniel Lintott





Bug#752384: HEAnet sourceforge mirror is outdated

2014-07-07 Thread Paul Wise
On Mon, 2014-07-07 at 12:15 +0100, Daniel Lintott wrote:

> I don't know whether this has been found/investigated before but there
> appears to be an RSS feed for each project containing the file downloads.
>
> So for my package VPCS, the RSS feed is at:
> https://sourceforge.net/projects/vpcs/rss

Thanks, I wasn't aware of that.

> It would seem this is a viable alternative as the links provided map to
> the SF mirror selector, so should always return a mirror with the
> correct files present.

That isn't quite what we need for uscan but could be a good start, we
would still need to process the RSS into HTML though.
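
As a rough idea of what that processing involves, here is a minimal sketch in Python. The sample feed is hypothetical and heavily trimmed (a real SourceForge feed carries more elements per item), and the function name is made up for illustration.

```python
import html
import posixpath
import xml.etree.ElementTree as ET

# Hypothetical, trimmed feed: each item title holds the path of a released file
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example project</title>
    <item><title>/vpcs/0.6/vpcs-0.6-src.tbz</title></item>
    <item><title>/vpcs/0.5b2/vpcs-0.5b2-src.tbz</title></item>
  </channel>
</rss>"""


def rss_to_links(rss_text):
    """Return one HTML anchor per feed item, using the basename of its title."""
    channel = ET.fromstring(rss_text).find("channel")
    links = []
    for item in channel.findall("item"):
        name = posixpath.basename(item.findtext("title", default=""))
        if name:
            # Escape the name before placing it in the attribute and the text
            safe = html.escape(name, quote=True)
            links.append(f'<a href="{safe}">{safe}</a><br>')
    return links


for line in rss_to_links(SAMPLE_RSS):
    print(line)
```

The output is a plain list of links whose hrefs contain the file names, which is the shape of page uscan expects to scan.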

I have been in contact with the SourceForge people and they are in the
evaluation process of creating a permanent fix for us:

http://sourceforge.net/p/forge/site-support/8064/

-- 
bye,
pabs

http://wiki.debian.org/PaulWise




Bug#752384: HEAnet sourceforge mirror is outdated

2014-06-23 Thread Paul Wise
Hi HEAnet mirror operators,

The Debian project[1] QA group[2] is currently relying[3] on this
command to detect new versions of software that is packaged in Debian:

rsync -Pvan --log-file=/dev/null --list-only ftp.heanet.ie::sourceforge

We received a report that the HEAnet mirror of SourceForge files is at
least 10 days out of date. For example, compare these two URLs,
version 4.4 is missing from the HEAnet mirror:

ftp://ftp.heanet.ie/mirrors/sourceforge/s/sw/sweethome3d/SweetHome3D-source/
https://sourceforge.net/projects/sweethome3d/files/SweetHome3D-source/
https://bugs.debian.org/752384

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Bug#752384: HEAnet sourceforge mirror is outdated

2014-06-23 Thread Mirror Team
On Mon, 23 Jun 2014 17:23:35 +0800
Paul Wise p...@debian.org wrote:

> Hi HEAnet mirror operators,
>
> The Debian project[1] QA group[2] is currently relying[3] on this
> command to detect new versions of software that is packaged in Debian:
>
> rsync -Pvan --log-file=/dev/null --list-only ftp.heanet.ie::sourceforge
>
> We received a report that the HEAnet mirror of SourceForge files is at
> least 10 days out of date. For example, compare these two URLs,
> version 4.4 is missing from the HEAnet mirror:
>
> ftp://ftp.heanet.ie/mirrors/sourceforge/s/sw/sweethome3d/SweetHome3D-source/
> https://sourceforge.net/projects/sweethome3d/files/SweetHome3D-source/
> https://bugs.debian.org/752384

Hello Paul,

I have contacted the SourceForge mirror admins who have identified a
problem on the HEAnet end and we're currently investigating.

Meanwhile it would seem that bug #752384 is erroneous. It assumes
SourceForge syncs all files to all mirrors. They do not; SF syncs
auto-selected content to mirrors and knows, through its redirector,
where that content is.

The problem is on the Debian end, you need to use the SourceForge
mirror selector and not assume we, HEAnet, are a full reference source.

Hope this helps,
C.
-- 
Senior System Administrator, MNS.
HEAnet Limited | http://www.heanet.ie/
1st Floor, 5 George's Dock, IFSC, Dublin 1.
T: +353-1-6609040 | F: +353-1-6603666
Registered in Ireland, no 275301





Bug#752384: HEAnet sourceforge mirror is outdated

2014-06-23 Thread Paul Wise
On Mon, 2014-06-23 at 16:41 +0100, HEAnet Mirror Team wrote:

> I have contacted the SourceForge mirror admins who have identified a
> problem on the HEAnet end and we're currently investigating.

Thanks.

> Meanwhile it would seem that bug #752384 is erroneous. It assumes
> SourceForge syncs all files to all mirrors. They do not; SF syncs
> auto-selected content to mirrors and knows, through its redirector,
> where that content is.

Hmm, we were not aware of this limitation, thanks for the info.

> The problem is on the Debian end, you need to use the SourceForge
> mirror selector and not assume we, HEAnet, are a full reference source.

Sadly, right now using HEAnet is the best way to get what we need.

Our tool for checking versions of upstream software (called uscan,
documentation link below) relies on an HTML page with links to all
released files (source tarballs etc) available for the upstream software
in question. We detect version numbers by looking at filenames in link
hrefs. For most projects that suffices. SourceForge does not currently
provide such an HTML page and over the years the situation has gotten
worse. Initially in 2005 we just redirected to the HEAnet mirror's web
interface. Over the years we flipped about between different SourceForge
mirrors, finding ones that allowed HTML access to the full file listing.
In 2009 changes in the SourceForge download system forced us to move from
that strategy to downloading the complete list of all files from all
SourceForge projects from HEAnet via rsync. Despite being fairly hacky
this has worked fairly well and there have been very few complaints from
Debian developers about it. Clearly it isn't the optimal strategy so we
would be very grateful if you could forward this mail to the SourceForge
admins. It would be great for the SourceForge infrastructure itself to
provide what we need at something like the second URL below.

http://manpages.debian.org/man0/uscan
https://sourceforge.net/projects/$project/files/all
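
To make the "filenames in link hrefs" step concrete, here is a simplified sketch in Python. It is not uscan's actual implementation (uscan is driven by the patterns in each package's watch file and uses Debian version comparison); the page content, pattern and helper name are assumptions for illustration.

```python
import re

# Hypothetical listing page of the kind described above
PAGE = """
<a href="SweetHome3D-4.3-src.zip">SweetHome3D-4.3-src.zip</a><br>
<a href="SweetHome3D-4.4-src.zip">SweetHome3D-4.4-src.zip</a><br>
<a href="README.txt">README.txt</a><br>
"""

# Pattern in the spirit of a watch file: the group captures the version
PATTERN = re.compile(r'href="SweetHome3D-(\d+(?:\.\d+)*)-src\.zip"')


def version_key(version):
    """Turn '4.10' into (4, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


versions = PATTERN.findall(PAGE)
newest = max(versions, key=version_key)

print(versions)
print(newest)
```

Links that do not match the pattern, such as the README, are simply ignored.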

-- 
bye,
pabs

http://wiki.debian.org/PaulWise

