Bug#752384: HEAnet sourceforge mirror is outdated
Hi Paul,

On 11/09/14 04:20, Paul Wise wrote:
> On Tue, Jul 22, 2014 at 5:29 PM, Daniel Lintott wrote:
>> I shall drop another version of the patch to the bug report that reverts the custom caching mechanism.
>
> A yak shaving exercise reminded me that this hasn't been done yet, could you please send the updated patch?

Apologies for not getting this done sooner, got tied up with some other projects.

Attached is a new patch against the old sf.wml from SVN, that doesn't include any caching mechanism.

Regards
Daniel

--- ../sf-redirect-old/sf.wml 2014-07-21 19:24:00.835216162 +0100
+++ sf.wml 2014-09-11 11:46:00.277558546 +0100
@@ -1,21 +1,10 @@
 <?php
-
-$data_dir = '/srv/qa.debian.org/data/watch';
-
 // need to strip leading slash, sf.net doesn't like double slashes
 $project=ltrim($_SERVER['PATH_INFO'], '/');
 
 if (!$project) {
-header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
-exit;
-}
-
-$fdb = $data_dir . '/sf-list.db';
-
-if (!file_exists($fdb)) {
-header('HTTP/1.0 500 Internal Server Error');
-die('The files database is not available. Please report this message to'.
-    ' debian-qa@lists.debian.org');
+    header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
+    exit;
 }
 
 // $project is not a file and doesn't have trailing slash
@@ -29,40 +18,31 @@
     exit;
 }
 
-$db = dba_open($fdb, 'r', 'db4');
-
-if (!dba_exists($project, $db)) {
-header('HTTP/1.0 404 File Not Found');
-die('There is no information about the '.$project.' project.');
-}
+$xml_url = "https://sourceforge.net/projects/$project/rss";
 
-?><html>
+$xml = simplexml_load_file($xml_url, 'SimpleXMLElement', LIBXML_NOCDATA);
+$title = $xml->channel[0]->title;
+$files = $xml->channel[0]->item;
+?>
+<html>
 <head>
-<title>File listing for project <?php echo htmlspecialchars($project); ?></title>
+<title>File listing for project <?php echo $title; ?></title>
 </head>
 <body>
 <p>
-<h1>File listing for project <?php echo htmlspecialchars($project); ?></h1>
-Visit <a href="http://sf.net/projects/<?php echo htmlspecialchars($project); ?>"><?php echo
-htmlspecialchars($project); ?>'s project page</a>.<br/><br/>
+<h1>File listing for project <?php echo $title; ?></h1>
+Visit <a href="http://sf.net/projects/<?php echo $project; ?>"><?php echo $project; ?>'s project page</a>.<br><br>
 <?php
-echo dba_fetch($project, $db);
+foreach ($files as $item) {
+    $file = basename($item->title);
+    $link = $_SERVER['SCRIPT_NAME'] . "/$project/$file";
+    echo "<a href='$file'>$file</a><br>\n";
+}
 ?>
 </p>
 <p>
-Thanks to <a href="http://ftp.heanet.ie/">HEAnet's mirror service</a>
-for being the source of data for this service.
-</p>
-<p>
 Get the source code:
 <a href="svn://anonscm.debian.org/svn/qa/trunk/wml/watch">checkout SVN repository</a> &#124;
 <a href="http://anonscm.debian.org/viewvc/qa/trunk/wml/watch/">browse SVN repository</a>
 </p>
-<p> Last database update:
-<?php echo date(DATE_RFC822, filemtime($fdb)); ?>
-</p>
 </body>
-</html><?php
-
-dba_close($db);
-
-?>
+</html>
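The core idea of the patch above (read a project's RSS feed, take the base name of each item title, and emit one link per file) can be sketched as follows. This is a rough illustration in Python rather than the PHP of the patch. The sample feed, its file names, and the function name `render_listing` are made up for the example, and the HTML escaping is an addition of this sketch, not something the patch does.

```python
import html
import posixpath
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed feed; real SourceForge feeds carry more fields.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>example-project</title>
    <item><title><![CDATA[/1.1/example-1.1.tar.gz]]></title></item>
    <item><title><![CDATA[/1.0/example-1.0.tar.gz]]></title></item>
  </channel>
</rss>"""


def render_listing(rss_text, script_name, project):
    """Turn RSS item titles into a plain list of HTML links."""
    channel = ET.fromstring(rss_text).find("channel")
    title = channel.findtext("title", default=project)
    lines = ["<h1>File listing for project %s</h1>" % html.escape(title)]
    for item in channel.findall("item"):
        # Item titles are assumed to be file paths; keep only the file name.
        name = posixpath.basename(item.findtext("title", default=""))
        if not name:
            continue
        link = "%s/%s/%s" % (script_name, project, name)
        lines.append('<a href="%s">%s</a><br>'
                     % (html.escape(link, quote=True), html.escape(name)))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_listing(SAMPLE_RSS, "/watch/sf.php", "example-project"))
```

Escaping the feed title and file names is a defensive choice in this sketch, since both values come from a remote feed.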
Bug#752384: HEAnet sourceforge mirror is outdated
On Thu, Sep 11, 2014 at 9:53 PM, Paul Wise wrote:

> Thanks! Committed and made live:

Daniel, there is one bug I'm hoping you can help with since I've mostly forgotten how to write PHP. URLs like this:

https://qa.debian.org/watch/sf.php/chromium-bsu

need to be redirected to URLs like this:

https://qa.debian.org/watch/sf.php/chromium-bsu/

Otherwise the links within the page will go to the wrong place.

--
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
Hi Paul,

On 11/09/14 15:21, Paul Wise wrote:
> On Thu, Sep 11, 2014 at 9:53 PM, Paul Wise wrote:
>> Thanks! Committed and made live:
>
> Daniel, there is one bug I'm hoping you can help with since I've mostly forgotten how to write PHP. URLs like this:
>
> https://qa.debian.org/watch/sf.php/chromium-bsu
>
> need to be redirected to URLs like this:
>
> https://qa.debian.org/watch/sf.php/chromium-bsu/
>
> Otherwise the links within the page will go to the wrong place.

I've attached a patch which should solve this problem. In fact, it was already in my script, just not used, as you'll see.

I've tested the patch locally and it seems to function okay for URLs both with and without a trailing slash.

Any problems, let me know.

Regards
Daniel

Use the link which has been generated from the project and file name. This avoids complications with a URL having (or not having) a trailing slash.

Index: sf.wml
===================================================================
--- sf.wml (revision 3259)
+++ sf.wml (working copy)
@@ -36,7 +36,7 @@
 foreach ($files as $item) {
     $file = basename($item->title);
     $link = $_SERVER['SCRIPT_NAME'] . "/$project/$file";
-    echo "<a href='$file'>$file</a><br>\n";
+    echo "<a href='$link'>$file</a><br>\n";
 }
 ?>
 </p>
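Why the trailing slash matters, and why a link built from the script name, project and file name sidesteps it, can be checked with standard relative-URL resolution. The sketch below uses Python's `urljoin` for illustration only. The file name `example-1.0.tar.gz` is hypothetical; the page URLs are the ones from Paul's report.

```python
from urllib.parse import urljoin

PAGE_NO_SLASH = "https://qa.debian.org/watch/sf.php/chromium-bsu"
PAGE_SLASH = "https://qa.debian.org/watch/sf.php/chromium-bsu/"

# Hypothetical file name, used only for illustration.
RELATIVE_HREF = "example-1.0.tar.gz"
# Link built the way the patch does it: script name + project + file.
ABSOLUTE_HREF = "/watch/sf.php/chromium-bsu/example-1.0.tar.gz"

# A relative href replaces the last path segment when there is no trailing
# slash, so the project name is lost from the resulting URL.
relative_no_slash = urljoin(PAGE_NO_SLASH, RELATIVE_HREF)
relative_slash = urljoin(PAGE_SLASH, RELATIVE_HREF)

# A path-absolute href resolves to the same URL from either page.
absolute_no_slash = urljoin(PAGE_NO_SLASH, ABSOLUTE_HREF)
absolute_slash = urljoin(PAGE_SLASH, ABSOLUTE_HREF)

if __name__ == "__main__":
    for url in (relative_no_slash, relative_slash,
                absolute_no_slash, absolute_slash):
        print(url)
```

With the path-absolute form, no redirect to the slash-terminated URL is needed for the links to resolve correctly.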
Bug#752384: HEAnet sourceforge mirror is outdated
On Fri, Sep 12, 2014 at 12:59 AM, Daniel Lintott wrote:

> I've attached a patch which should solve this problem, in fact it was already in my script just not used as you'll see.

Excellent, applied!

BTW: we could always use more help with QA infra, more details here :)

https://wiki.debian.org/qa.debian.org

--
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
On Tue, Jul 22, 2014 at 5:29 PM, Daniel Lintott wrote:

> I would say that seems like a sensible suggestion. Using a ready-made system for the caching obviously has major benefits of being well-established and supported.
>
> Personally I've never used or setup such a service, but there are several available: nginx[1]; trafficserver[2]; haproxy[3]

I expect we will just go with mod_cache from apache2, seems the simplest option.

> I shall drop another version of the patch to the bug report that reverts the custom caching mechanism.

A yak shaving exercise reminded me that this hasn't been done yet, could you please send the updated patch?

--
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
On 22/07/14 02:05, Paul Wise wrote:
> On Tue, Jul 22, 2014 at 2:50 AM, Daniel Lintott wrote:
>> Okay... It took a bit of thinking of how to work it, but I've come up with a working solution that caches the file list for each project requested.
>
> There was some discussion on IRC about the problem and a caching proxy was suggested instead:
>
> sf rss => rss to html converter => caching proxy => uscan
>
> Thinking about it some more, caching the output makes more sense than caching filenames and using a HTTP caching proxy is the usual way to cache HTML/websites so it might be best to just do that. Thoughts?

I would say that seems like a sensible suggestion. Using a ready-made system for the caching obviously has major benefits of being well-established and supported.

Personally I've never used or set up such a service, but there are several available: nginx[1]; trafficserver[2]; haproxy[3]

[1] https://packages.debian.org/wheezy/nginx
[2] https://packages.debian.org/wheezy/trafficserver
[3] https://packages.debian.org/wheezy-backports/haproxy

I shall drop another version of the patch to the bug report that reverts the custom caching mechanism.

Regards
Daniel
Bug#752384: HEAnet sourceforge mirror is outdated
On 21/07/14 04:11, Paul Wise wrote:
> Unfortunately the final word on that is that Debian needs to replace our current redirector with one based on the RSS feature and also to add a cache mechanism (they suggested a 1 hour cache time) so that we don't overload the RSS feature.
>
> Do you think it would be possible for you to add a caching mechanism and convert your version into a patch that we can apply to SVN?

It should definitely be possible to add a caching mechanism to the new redirector. Currently I have a couple of ideas on this, but both have drawbacks.

1. Use a Berkeley DB to store the retrieved data, similar to what is currently done. Something like:

   | project | file | update_time |
   |         |      |             |

   Problems I foresee:

   * My intention would be to check at the time the script is requested if the update_time > 1 hour ago...
     - if yes... get the new RSS and update the DB
     - if no... use the information from the DB

     What happens if this happens for multiple requests at the same time?

2. Save the XML file to a cache folder

   Then at request time check the time on that file and its age.

   The only problem I can see this causing is disk space (I don't know how much of an issue this is for Debian).

   The RSS file for the VPCS project is almost 52KB. Picking a figure out of the air (as I've no idea how many packages use the redirector) of 10,000, this is going to create 520MB of cached files.

   Obviously some projects may have a smaller RSS and others larger, which may skew that estimate.

Any comments on these would be most appreciated.

Cheers,
Daniel
Re: Bug#752384: HEAnet sourceforge mirror is outdated
Hi Daniel,

many thanks for your work on this!

> It should definitely be possible to add a caching mechanism to the new redirector, currently I have a couple of ideas on this but both have drawbacks.
>
> 1. Use a Berkeley DB to store the retrieved data, similar to what is currently done.

BDB may not be a particularly good choice either [1] -- there are other DBs suggested in that thread that seem to have a better future.

[1] https://lists.debian.org/debian-devel/2014/06/msg00328.html

> Problems I foresee:
>
> * My intention would be to check at the time the script is requested if the update_time > 1 hour ago...
>   - if yes... get the new RSS and update the DB
>   - if no... use the information from the DB
>
>   What happens if this happens for multiple requests at the same time?

If the update of the db is atomic (which is easy to arrange for most DBs), then I'd be tempted to ignore that you might very occasionally request the same RSS feed twice in quick succession. Others may disagree with me here... but I'd worry about that later if it is actually a problem.

An alternative to worrying about locking would be to only update the db from cron. This adds some latency to the scan, which is annoying for the maintainer sometimes but not really an issue from the QA perspective. If the RSS feed offers a Last-Modified header for HEAD requests, then the cron job can be done easily and often (perhaps that should be investigated anyway?).

> 2. Save the XML file to a cache folder
>
> Then at request time check the time on that file and its age.
>
> The only problem I can see this causing is disk space (I don't know how much of an issue this is for Debian)

Rather than keeping all of them all the time, you could delete them as soon as they are older than the refresh time; that could be done with find from cron. If we were to split the QA cron jobs that check for outdated sources across the day, it would be easy to keep that number down to 10-20% of the total XML.

> The RSS file for the VPCS project is almost 52KB. Picking a figure out of the air (as I've no idea how many packages use the redirector) of 10,000, this is going to create 520MB of cached files.

Just to help with one of the two random numbers:

udd=> select count(*) from upstream where watch_file like '%sf.net/%';
 count
-------
  2409
(1 row)

hope that helps!

cheers
Stuart

--
Stuart Prescott    http://www.nanonanonano.net/   stu...@nanonanonano.net
Debian Developer   http://www.debian.org/         stu...@debian.org
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7
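The disk-space estimate is simple multiplication, so the two figures can be compared directly. Daniel's 520MB corresponds to 10,000 cached feeds at 52KB each, while the UDD query above counts 2409 watch files that point at sf.net. The sketch below assumes, as Daniel does, that the VPCS feed size is typical, and uses decimal units (1 MB = 1000 KB). The function name is made up for the example.

```python
def cache_size_mb(feed_count, avg_feed_kb=52):
    """Estimated disk use, in MB, if every feed is cached at once.

    avg_feed_kb defaults to the size quoted for the VPCS feed; treating it
    as an average across all projects is an assumption.
    """
    return feed_count * avg_feed_kb / 1000


if __name__ == "__main__":
    print(cache_size_mb(10000))  # Daniel's guessed package count
    print(cache_size_mb(2409))   # count returned by the UDD query
```

With the UDD count the estimate drops to roughly 125MB, before any of the expiry or compression ideas raised in the thread are applied.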
Bug#752384: HEAnet sourceforge mirror is outdated
Hi All,

On Mon, 2014-07-21 at 11:02 +0100, Daniel Lintott wrote:
> On 21/07/14 04:11, Paul Wise wrote:
>> Unfortunately the final word on that is that Debian needs to replace our current redirector with one based on the RSS feature and also to

I honestly was expecting such an answer!

>> add a cache mechanism (they suggested a 1 hour cache time) so that we don't overload the RSS feature.

Are we really consuming so much bandwidth for that feature? I assume this will happen each time a user or a daemon wants to check a particular package. I'm not convinced this is worth it, especially as they ask for a cache of 1 hour. Do we expect that per package we do a check more than twice per day (daily daemon + random user)?

>> Do you think it would be possible for you to add a caching mechanism and convert your version into a patch that we can apply to SVN?
>
> It should definitely be possible to add a caching mechanism to the new redirector, currently I have a couple of ideas on this but both have drawbacks.
>
> 1. Use a Berkeley DB to store the retrieved data, similar to what is currently done. Something like:
>
>    | project | file | update_time |
>    |         |      |             |
>
>    Problems I foresee:
>
>    * My intention would be to check at the time the script is requested if the update_time > 1 hour ago...
>      - if yes... get the new RSS and update the DB
>      - if no... use the information from the DB
>
>      What happens if this happens for multiple requests at the same time?
>
> 2. Save the XML file to a cache folder

I like this one.

> Then at request time check the time on that file and its age.
>
> The only problem I can see this causing is disk space (I don't know how much of an issue this is for Debian)
>
> The RSS file for the VPCS project is almost 52KB. Picking a figure out of the air (as I've no idea how many packages use the redirector) of 10,000, this is going to create 520MB of cached files.
>
> Obviously some projects may have a smaller RSS and others larger which may skew that estimate.
>
> Any comments on these would be most appreciated.

What about compressing the files? This can reduce the size dramatically. Can you please check for the file you used as an example?

Cheers,
Abou Al Montacir
Bug#752384: HEAnet sourceforge mirror is outdated
On Mon, 2014-07-21 at 15:39 +0200, Abou Al Montacir wrote:

> Are we really consuming so much bandwidth for that feature? I assume this will happen each time a user or a daemon wants to check a particular package. I'm not convinced this is worth it, especially as they ask for a cache of 1 hour. Do we expect that per package we do a check more than twice per day (daily daemon + random user)?

I told them the average usage based on the stats from qa.d.o Apache logs (up to 30K requests per day); they said that was a bit high and asked us to implement a cache.

> What about compressing the files? This can reduce the size dramatically. Can you please check for the file you used as an example?

Seems pointless to store the raw RSS, best extract the filenames and store them in a database instead.

--
bye,
pabs

http://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
Control: -1 + patch

On 21/07/14 14:58, Paul Wise wrote:
> On Mon, 2014-07-21 at 15:39 +0200, Abou Al Montacir wrote:
>> Are we really consuming so much bandwidth for that feature? I assume this will happen each time a user or a daemon wants to check a particular package. I'm not convinced this is worth it, especially as they ask for a cache of 1 hour. Do we expect that per package we do a check more than twice per day (daily daemon + random user)?
>
> I told them the average usage based on the stats from qa.d.o Apache logs (up to 30K requests per day); they said that was a bit high and asked us to implement a cache.

That doesn't surprise me in the least! GetDeb actually switched to using my test redirector and in 5 days I logged nearly 32000 hits at my server... each of which would have been passed to sf.net (this was quickly resolved though).

>> What about compressing the files? This can reduce the size dramatically. Can you please check for the file you used as an example?
>
> Seems pointless to store the raw RSS, best extract the filenames and store them in a database instead.

Okay... It took a bit of thinking of how to work it, but I've come up with a working solution that caches the file list for each project requested.

I am storing each project's file list in a separate Berkeley DB, so we can check the file modification time and only update when the file is older than the cache limit ($cache_time) in seconds (currently 3600 seconds).

Currently it is configured to store these files in a subdirectory, cache ($cache_dir), which will need to be writeable by the web server.

Otherwise I don't think there is anything else particularly special to report. I have updated my test server and it is now running the latest version of the script.

Regards,
Daniel

--- ../sf-redirect-old/sf.wml 2014-07-21 19:24:00.835216162 +0100
+++ sf.wml 2014-07-21 19:45:21.683113723 +0100
@@ -1,21 +1,12 @@
 <?php
-
-$data_dir = '/srv/qa.debian.org/data/watch';
-
 // need to strip leading slash, sf.net doesn't like double slashes
 $project=ltrim($_SERVER['PATH_INFO'], '/');
+$cache_dir = './cache';
+$cache_time = 3600;
 
 if (!$project) {
-header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
-exit;
-}
-
-$fdb = $data_dir . '/sf-list.db';
-
-if (!file_exists($fdb)) {
-header('HTTP/1.0 500 Internal Server Error');
-die('The files database is not available. Please report this message to'.
-    ' debian-qa@lists.debian.org');
+    header('Location: http://manpages.debian.net/cgi-bin/man.cgi?query=uscan');
+    exit;
 }
 
 // $project is not a file and doesn't have trailing slash
@@ -29,40 +20,60 @@
     exit;
 }
 
-$db = dba_open($fdb, 'r', 'db4');
+$db_file = "$cache_dir/$project.db";
 
-if (!dba_exists($project, $db)) {
-header('HTTP/1.0 404 File Not Found');
-die('There is no information about the '.$project.' project.');
+if (file_exists($db_file) and time() - filemtime($db_file) < $cache_time ) {
+    # Open the db_file for reading
+    $db = dba_open($db_file, 'r', 'db4');
+} else {
+$xml_url = "https://sourceforge.net/projects/$project/rss";
+    # Update/create the db_file, then read it's contents
+    # Load the rss feed using simplexml
+    $xml = @simplexml_load_file($xml_url, 'SimpleXMLElement', LIBXML_NOCDATA);
+    if ($xml === false) {
+        echo "No project named $project could be found, check the project name and try again";
+        exit;
+    } else {
+        # Get an array of files from the XML
+        $files = $xml->channel[0]->item;
+        # Create a new db file
+        $db = dba_open($db_file . '-new', 'c', 'db4');
+        # Add the file list to the db
+        $i = 0;
+        foreach ($files as $item) {
+            dba_insert($i, basename($item->title),$db);
+            $i++;
+        }
+        dba_close($db);
+        rename($db_file . '-new', $db_file);
+        $db = dba_open($db_file, 'r', 'db4');
+    }
 }
-
-?><html>
+?>
+<html>
 <head>
 <title>File listing for project <?php echo htmlspecialchars($project); ?></title>
 </head>
 <body>
 <p>
 <h1>File listing for project <?php echo htmlspecialchars($project); ?></h1>
-Visit <a href="http://sf.net/projects/<?php echo htmlspecialchars($project); ?>"><?php echo
-htmlspecialchars($project); ?>'s project page</a>.<br/><br/>
+Visit <a href="http://sf.net/projects/<?php echo htmlspecialchars($project); ?>"><?php
+echo htmlspecialchars($project); ?>'s project page</a>.<br><br>
 <?php
-echo dba_fetch($project, $db);
+$key = dba_firstkey($db);
+while ($key !== False) {
+    $file = dba_fetch($key, $db);
+    $link = $_SERVER['SCRIPT_NAME'] . "/$project/$file";
+    echo "<a href='$link'>$file</a><br>\n";
+    $key = dba_nextkey($db);
+}
 ?>
 </p>
-<p>
-Thanks to <a href="http://ftp.heanet.ie/">HEAnet's mirror service</a>
-for being the source of data for this service.
-</p>
+<p>Last database update: <?php echo date(DATE_RFC822, filemtime($db_file)); ?></p>
 <p>
 Get the source code:
 <a href="svn://anonscm.debian.org/svn/qa/trunk/wml/watch">checkout SVN repository</a> &#124;
 <a href="http://anonscm.debian.org/viewvc/qa/trunk/wml/watch/">browse SVN repository</a>
 </p>
-<p> Last database update:
-<?php echo date(DATE_RFC822, filemtime($fdb)); ?>
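The caching pattern in this patch (reuse the cached list while the file is younger than the cache time, otherwise rebuild it under a temporary name and rename it into place) can be sketched as below. This is an illustration in Python, not the PHP from the patch. It stores a plain text file instead of a Berkeley DB, the `fetch` callback stands in for the RSS download, and `project` is assumed to be an already validated plain name with no path separators.

```python
import os
import tempfile
import time

CACHE_TIME = 3600  # seconds, matching $cache_time in the patch


def cached_file_list(cache_dir, project, fetch, now=None):
    """Return the file list for project, refreshing the cache when stale.

    fetch(project) is a caller-supplied function returning a list of file
    names; it is a placeholder for downloading and parsing the RSS feed.
    """
    now = time.time() if now is None else now
    path = os.path.join(cache_dir, project + ".txt")
    try:
        fresh = now - os.path.getmtime(path) < CACHE_TIME
    except OSError:
        fresh = False  # no cache file yet

    if not fresh:
        names = fetch(project)
        # Write to a temporary file in the same directory, then rename it
        # over the old cache so readers never see a half-written file.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir,
                                        prefix=project + ".", suffix=".new")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write("\n".join(names))
        os.replace(tmp_path, path)

    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()
```

Two requests arriving at the same moment may both fetch the feed, but the rename keeps the cache file itself consistent, which is the trade-off Stuart describes elsewhere in the thread.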
Bug#752384: HEAnet sourceforge mirror is outdated
On Tue, Jul 22, 2014 at 2:50 AM, Daniel Lintott wrote:

> Okay... It took a bit of thinking of how to work it, but I've come up with a working solution that caches the file list for each project requested.

There was some discussion on IRC about the problem and a caching proxy was suggested instead:

sf rss => rss to html converter => caching proxy => uscan

Thinking about it some more, caching the output makes more sense than caching filenames and using a HTTP caching proxy is the usual way to cache HTML/websites so it might be best to just do that. Thoughts?

--
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
On Tue, Jul 8, 2014 at 5:26 PM, Daniel Lintott wrote:

> Indeed... I have managed to replicate the functionality of the current SF redirector using the RSS feed

...

> Ack... That would be very nice to see. Let's hope they can come up with a nice solution.

Unfortunately the final word on that is that Debian needs to replace our current redirector with one based on the RSS feature and also to add a cache mechanism (they suggested a 1 hour cache time) so that we don't overload the RSS feature.

Do you think it would be possible for you to add a caching mechanism and convert your version into a patch that we can apply to SVN?

--
bye,
pabs

https://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated (was Re: [Pkg-pascal-devel] Lazarus watch file is broken)
Hi Daniel,

On Sun, 2014-07-13 at 11:21 +0100, Daniel Lintott wrote:
> Hi,
> ...
> I think you're actually following the bug at [1]. You can see the conversation I had with Paul in that bug report.
>
> Regards,
> Daniel Lintott

I have tested your tool for Lazarus and it looks to be working as expected. I'd recommend using this solution in [2], as it looks really easy to maintain/update with so few PHP lines. Also, I don't know how much time it will take SF.net to come up with a solution for [3].

BTW, do you expect one can rely on [1] and modify one's watch file to use it instead of [2] until this bug gets fixed?

[1] http://alpha.serverb.co.uk/debian/sf.php/lazarus
[2] https://qa.debian.org/watch/sf.php/lazarus
[3] http://sourceforge.net/p/forge/site-support/8064/
Bug#752384: HEAnet sourceforge mirror is outdated (was Re: [Pkg-pascal-devel] Lazarus watch file is broken)
Hi Abou,

On 13/07/14 12:40, Abou Al Montacir wrote:
> Hi Daniel,
> ...
> I have tested your tool for Lazarus and it looks to be working as expected.

That's always good to know!

> I'd recommend using this solution in [2], as it looks really easy to maintain/update with so few PHP lines. Also, I don't know how much time it will take SF.net to come up with a solution for [3].
>
> BTW, do you expect one can rely on [1] and modify one's watch file to use it instead of [2] until this bug gets fixed?

It shouldn't be a problem to use [1], there should be enough bandwidth to manage. So I would say... go ahead and use it... but I do only intend this to be a temporary fix.

How long it will take SF.net to come up with a solution is anyone's guess! But if it looks like my script will be 'long-term', it would obviously be preferable to move it to a Debian server.

> [1] http://alpha.serverb.co.uk/debian/sf.php/lazarus
> [2] https://qa.debian.org/watch/sf.php/lazarus
> [3] http://sourceforge.net/p/forge/site-support/8064/

Regards,
Daniel
Bug#752384: HEAnet sourceforge mirror is outdated
On 07/07/14 12:27, Paul Wise wrote:
> On Mon, 2014-07-07 at 12:15 +0100, Daniel Lintott wrote:
>> I don't know whether this has been found/investigated before but there appears to be an RSS feed for each project containing the file downloads. So for my package VPCS, the RSS feed is at: https://sourceforge.net/projects/vpcs/rss
>
> Thanks, I wasn't aware of that.

No problem ;)

>> It would seem this is a viable alternative as the links provided map to the SF mirror selector, so should always return a mirror with the correct files present.
>
> That isn't quite what we need for uscan but could be a good start, we would still need to process the RSS into HTML though.

Indeed... I have managed to replicate the functionality of the current SF redirector using the RSS feed

Demo: http://alpha.serverb.co.uk/debian/sf.php/vpcs
Git: http://anonscm.debian.org/gitweb/?p=users/dlintott-guest/sf-rss.git

Feel free to use if you'd like (it was mainly an exercise to flex my PHP skills again!)

> I have been in contact with the SourceForge people and they are in the evaluation process of creating a permanent fix for us:
>
> http://sourceforge.net/p/forge/site-support/8064/

Ack... That would be very nice to see. Let's hope they can come up with a nice solution.
Bug#752384: HEAnet sourceforge mirror is outdated
Hi Paul,

I just hit this same problem with one of my packages that is hosted on SF.net and the HEAnet mirror doesn't hold the latest source.

I don't know whether this has been found/investigated before but there appears to be an RSS feed for each project containing the file downloads. So for my package VPCS, the RSS feed is at: https://sourceforge.net/projects/vpcs/rss

It would seem this is a viable alternative as the links provided map to the SF mirror selector, so should always return a mirror with the correct files present.

Regards
Daniel Lintott
Bug#752384: HEAnet sourceforge mirror is outdated
On Mon, 2014-07-07 at 12:15 +0100, Daniel Lintott wrote:

> I don't know whether this has been found/investigated before but there appears to be an RSS feed for each project containing the file downloads. So for my package VPCS, the RSS feed is at: https://sourceforge.net/projects/vpcs/rss

Thanks, I wasn't aware of that.

> It would seem this is a viable alternative as the links provided map to the SF mirror selector, so should always return a mirror with the correct files present.

That isn't quite what we need for uscan but could be a good start, we would still need to process the RSS into HTML though.

I have been in contact with the SourceForge people and they are in the evaluation process of creating a permanent fix for us:

http://sourceforge.net/p/forge/site-support/8064/

--
bye,
pabs

http://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
Hi HEAnet mirror operators,

The Debian project[1] QA group[2] is currently relying[3] on this command to detect new versions of software that is packaged in Debian:

rsync -Pvan --log-file=/dev/null --list-only ftp.heanet.ie::sourceforge

We received a report that the HEAnet mirror of SourceForge files is at least 10 days out of date. For example, compare these two URLs, version 4.4 is missing from the HEAnet mirror:

ftp://ftp.heanet.ie/mirrors/sourceforge/s/sw/sweethome3d/SweetHome3D-source/
https://sourceforge.net/projects/sweethome3d/files/SweetHome3D-source/

https://bugs.debian.org/752384

--
bye,
pabs

http://wiki.debian.org/PaulWise
Bug#752384: HEAnet sourceforge mirror is outdated
On Mon, 23 Jun 2014 17:23:35 +0800 Paul Wise <p...@debian.org> wrote:

> Hi HEAnet mirror operators,
>
> The Debian project[1] QA group[2] is currently relying[3] on this command to detect new versions of software that is packaged in Debian:
>
> rsync -Pvan --log-file=/dev/null --list-only ftp.heanet.ie::sourceforge
>
> We received a report that the HEAnet mirror of SourceForge files is at least 10 days out of date. For example, compare these two URLs, version 4.4 is missing from the HEAnet mirror:
>
> ftp://ftp.heanet.ie/mirrors/sourceforge/s/sw/sweethome3d/SweetHome3D-source/
> https://sourceforge.net/projects/sweethome3d/files/SweetHome3D-source/
>
> https://bugs.debian.org/752384

Hello Paul,

I have contacted the SourceForge mirror admins who have identified a problem on the HEAnet end and we're currently investigating.

Meanwhile it would seem that bug #752384 is erroneous. It assumes SourceForge sync all files to all mirrors. They do not, Sf sync auto-selected content to mirrors and have awareness as to where the content is from their redirector.

The problem is on the Debian end, you need to use the SourceForge mirror selector and not assume we, HEAnet, are a full reference source.

Hope this helps,
C.

--
Senior System Administrator, MNS.
HEAnet Limited | http://www.heanet.ie/
1st Floor, 5 George's Dock, IFSC, Dublin 1.
T: +353-1-6609040 | F: +353-1-6603666
Registered in Ireland, no 275301
Bug#752384: HEAnet sourceforge mirror is outdated
On Mon, 2014-06-23 at 16:41 +0100, HEAnet Mirror Team wrote:

> I have contacted the SourceForge mirror admins who have identified a problem on the HEAnet end and we're currently investigating.

Thanks.

> Meanwhile it would seem that bug #752384 is erroneous. It assumes SourceForge sync all files to all mirrors. They do not, Sf sync auto-selected content to mirrors and have awareness as to where the content is from their redirector.

Hmm, we were not aware of this limitation, thanks for the info.

> The problem is on the Debian end, you need to use the SourceForge mirror selector and not assume we, HEAnet, are a full reference source.

Sadly, right now using HEAnet is the best way to get what we need.

Our tool for checking versions of upstream software (called uscan, documentation link below) relies on a HTML page with links to all released files (source tarballs etc) available for the upstream software in question. We detect version numbers by looking at filenames in link hrefs. For most projects that suffices.

SourceForge does not currently provide such a HTML page and over the years has gotten worse. Initially in 2005 we just redirected to the HEAnet mirror's web interface. Over the years we flipped about between different SourceForge mirrors finding ones that allowed HTML access to the full file listing. In 2009 changes in the SourceForge download system forced us to move from that strategy to downloading the complete list of all files from all SourceForge projects from HEAnet via rsync. Despite being fairly hacky this has worked fairly well and there have been very few complaints from Debian developers about it.

Clearly it isn't the optimal strategy so we would be very grateful if you could forward this mail to the SourceForge admins, it would be great for the SourceForge infrastructure itself to provide what we need at something like the URL below.

http://manpages.debian.org/man0/uscan
https://sourceforge.net/projects/$project/files/all

--
bye,
pabs

http://wiki.debian.org/PaulWise
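The approach Paul describes, detecting versions by looking at file names in link hrefs, can be illustrated roughly as follows. This is a Python sketch of the general idea and not uscan's actual implementation. The page content, the file names, and the pattern are hypothetical.

```python
import re
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def find_versions(page_html, pattern):
    """Return the first capture group of pattern for each matching href."""
    collector = HrefCollector()
    collector.feed(page_html)
    regex = re.compile(pattern)
    versions = []
    for href in collector.hrefs:
        match = regex.search(href)
        if match:
            versions.append(match.group(1))
    return versions


# Hypothetical listing page; the file names are made up for the example.
SAMPLE_PAGE = """
<a href="/watch/sf.php/example/example-4.4-src.zip">example-4.4-src.zip</a><br>
<a href="/watch/sf.php/example/example-4.3-src.zip">example-4.3-src.zip</a><br>
<a href="http://sf.net/projects/example">project page</a>
"""

if __name__ == "__main__":
    print(find_versions(SAMPLE_PAGE, r"example-([\d.]+)-src\.zip$"))
```

A page with one plain link per released file is all this kind of matching needs, which is why the redirector only has to emit a flat list of anchors.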