Re: [gentoo-dev] Package ranking by number of ebuilds on the portage tree

2012-10-26 Thread Corentin Chary
On Fri, Oct 26, 2012 at 1:16 PM, Theo Chatzimichos tampak...@gentoo.org wrote:
 On Fri, Oct 26, 2012 at 2:02 PM, Francisco Blas Izquierdo Riera
 (klondike) klond...@gentoo.org wrote:
 So I have been doing some bash scripting out of some comment in a
 conversation to count (and rank) the packages by the number of ebuilds
 they have (and thus of versions of said package). The results can be
 seen at http://dev.gentoo.org/~klondike/ebuildrank.txt and if there is
 interest I can try to automate the generation of the ranking daily
 (though I'd like infra's comments on that).

 We can put the script in qa-reports.g.o

I think that kind of stats would be really easy to generate within the
new p.g.o.
Probably with a single database request.
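
A minimal sketch of the ranking itself, assuming a plain checkout laid out as category/package/*.ebuild. This is neither klondike's bash script nor the p.g.o database query; the function name rank_packages is only illustrative.

```python
import os
from collections import Counter


def rank_packages(tree_root):
    """Return (category/package, ebuild count) pairs, most ebuilds first."""
    counts = Counter()
    for category in sorted(os.listdir(tree_root)):
        cat_dir = os.path.join(tree_root, category)
        if not os.path.isdir(cat_dir):
            continue
        for package in sorted(os.listdir(cat_dir)):
            pkg_dir = os.path.join(cat_dir, package)
            if not os.path.isdir(pkg_dir):
                continue
            # One ebuild file corresponds to one version of the package.
            n = sum(1 for name in os.listdir(pkg_dir) if name.endswith(".ebuild"))
            if n:
                counts[category + "/" + package] = n
    # Highest count first; ties are broken by name so the output is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Calling rank_packages() on the root of a tree checkout and printing the first few pairs would give the top of the ranking.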


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] SRC_URI in metadata.xml

2012-08-11 Thread Corentin Chary
On Fri, Aug 10, 2012 at 10:12 PM, Diego Elio Pettenò
flamee...@flameeyes.eu wrote:
 On 10/08/2012 13:05, Corentin Chary wrote:
 Right, our proposal is not here to replace SRC_URI, it's here to fix
 the cases where SRC_URI can't be sanely used to guess new upstream
 versions (strange mangling rules, unbrowsable directories, etc...).

 Yes I guess Jeroen was just saying why we shouldn't abandon it as Gilles
 proposed.

 FWIW for the rest it feels right to me. Although this starts to add up
 to the reasons why at least metadata.xml should be validated by schema,
 and not DTD.

Maybe... We plan to use <watch xmlns="http://euscan.iksaif.net"> to
avoid editing metadata.dtd (for now).
What do you think about the proposed formats? The current format looks
like what was given in the examples, but mgorny feels that something
more XML-like would be better.

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] [RFC] euscan: Need to add more upstream info in metadata.xml

2012-08-10 Thread Corentin Chary
On Fri, Aug 10, 2012 at 2:03 PM, Gilles Dartiguelongue e...@gentoo.org wrote:
 Having done some debian packaging for work, I find watch files from
 debian really helpful. Changing the format to a XML compatible one does
 not seem like a hard work so I'll probably leave that up for others to
 discuss.

 Since you are proposing this, a side question is:
 Why should we write SRC_URI in ebuilds if that info is now available in
 metadata.xml ? (granted that we might still want to keep over-riding
 this information in ebuilds)

It's not (only) SRC_URI: sometimes it's completely different, and
sometimes watch would contain only versionmangle since SRC_URI already
contains enough information for euscan... SRC_URI serves a totally
different purpose :).

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] SRC_URI in metadata.xml

2012-08-10 Thread Corentin Chary
On Fri, Aug 10, 2012 at 4:21 PM, Jeroen Roovers j...@gentoo.org wrote:
 On Fri, 10 Aug 2012 14:03:23 +0200
 Gilles Dartiguelongue e...@gentoo.org wrote:

 Since you are proposing this, a side question is:
 Why should we write SRC_URI in ebuilds if that info is now available
 in metadata.xml ? (granted that we might still want to keep
 over-riding this information in ebuilds)

 1) The information in metadata.xml is inaccurate, it's a hint. When it
fails, nothing of value is lost since the ebuild (supposedly) has
what you want.
 2) SRC_URI is precise.
 3) SRC_URI can change over time, and across versions (even with all the
variables in place).
 4) Backward compatibility.
 5) The inversion of your question: Why should we start handling SRC_URI
outside ebuilds and eclasses? Or, how would that be practical,
advantageous, an improvement on the current situation.

Right, our proposal is not here to replace SRC_URI, it's here to fix
the cases where SRC_URI can't be sanely used to guess new upstream
versions (strange mangling rules, unbrowsable directories, etc...).

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] euscan GSoC project - requesting feedback

2012-06-29 Thread Corentin Chary
On Fri, Jun 29, 2012 at 4:07 AM, Kent Fredric kentfred...@gmail.com wrote:
 On 27 June 2012 19:51, Federico fox Scrinzi fo...@anche.no wrote:
 The main question is: what would you like to have on this dashboard?
 Currently (in the development version) there's the possibility to login
 and watch/unwatch packages/categories/herds/... and see the watched
 stuff in the account dashboard. We're planning on implementing a
 weekly(?) custom newsletter based on the packages you're watching, which
 features would you like?

 The project repo for the GSoC is here: https://github.com/volpino/euscan

 Thanks!


 For the most part it seems to get upstream / portage versioning right,
 but occasionally you get miss-matches for some reason.

 It would be nice to allow to provide some mapping mechanism that
 existed on the overlay itself to inform euscan how to map upstream
 versions to downstream ones, but implementing that would be far too
 complex I feel.

 Instead, it would be nice to have a mechanism in the interface to set
 a Upstream version is value for each package if euscan can't tell.

 Ie:

 http://euscan.iksaif.net/package/dev-perl/HTML-TreeBuilder-LibXML/

 Upstream is 0.71 , portage is ( normalised ) to 0.710.0 , and these
 are in fact the same version. So in 0.710.0 , it would be nice to be
 able to set the upstream version manually to 0.71 so that euscan no
 longer reported it as outdated.

 http://euscan.iksaif.net/package/dev-perl/Authen-SASL-Cyrus-server/

 0.13 == 0.13-serve

 http://euscan.iksaif.net/package/dev-perl/Module-Extract-Namespaces/

 0.140.200_rc == 0.14_0.2

 http://euscan.iksaif.net/package/dev-perl/Math-BaseCnv/
 http://euscan.iksaif.net/package/dev-perl/XML-Tidy/

 1.8 == 1.8.B59BrZ
 1.8 == 1.8.B2AMvd

 ( Upstream for those 2 packages have a versioning scheme tantamount to
 intolerable cruelty.
 https://rt.cpan.org/Public/Bug/Display.html?id=60275 )

 http://euscan.iksaif.net/package/dev-perl/Perl-Critic-Moose/
 0.999.2_rc == 0.999._002

 http://euscan.iksaif.net/package/dev-perl/aliased/
 0.300.100_rc == 0.30_0.1

 http://euscan.iksaif.net/package/dev-perl/EV/
 4.110.0  == 4.11


 --
 Kent

 perl -e  print substr( \edrgmaM  SPA NOcomil.ic\\@tfrken\, \$_ * 3,
 3 ) for ( 9,8,0,7,1,6,5,4,3,2 );

 http://kent-fredric.fox.geek.nz


Something that could help with that: we plan to add something like
debian/watch in metadata.xml. This should allow specifying a regexp to
mangle versions (sed-like syntax: /match/replace/). For some specific
packages, we also already have dedicated handlers (cpan, php, pypi,
etc.) which have their own version mangling functions.
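
A minimal sketch of how such sed-like rules could be applied, assuming each rule is exactly /match/replace/ with no escaped slashes inside. The names parse_rule and mangle are hypothetical and are not part of euscan.

```python
import re


def parse_rule(rule):
    """Split a sed-like '/match/replace/' rule into (compiled pattern, replacement)."""
    if len(rule) < 3 or rule[0] != "/" or rule[-1] != "/":
        raise ValueError("expected /match/replace/, got %r" % rule)
    parts = rule[1:-1].split("/")
    if len(parts) != 2:
        # Escaped slashes are not supported in this sketch.
        raise ValueError("expected exactly one separator in %r" % rule)
    return re.compile(parts[0]), parts[1]


def mangle(version, rules):
    """Apply each rule in order to an upstream version string."""
    for rule in rules:
        pattern, replacement = parse_rule(rule)
        version = pattern.sub(replacement, version)
    return version
```

For the packages listed above, a rule such as /\.B[0-9A-Za-z]+$// would turn 1.8.B59BrZ into 1.8, and /-serve$// would turn 0.13-serve into 0.13.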



Re: [gentoo-dev] euscan GSoC project - requesting feedback

2012-06-29 Thread Corentin Chary
On Fri, Jun 29, 2012 at 8:53 AM, Michał Górny mgo...@gentoo.org wrote:
 On Fri, 29 Jun 2012 14:07:58 +1200
 Kent Fredric kentfred...@gmail.com wrote:

 On 27 June 2012 19:51, Federico fox Scrinzi fo...@anche.no wrote:
  The main question is: what would you like to have on this dashboard?
  Currently (in the development version) there's the possibility to
  login and watch/unwatch packages/categories/herds/... and see the
  watched stuff in the account dashboard. We're planning on
  implementing a weekly(?) custom newsletter based on the packages
  you're watching, which features would you like?
 
  The project repo for the GSoC is here:
  https://github.com/volpino/euscan
 
  Thanks!
 

 For the most part it seems to get upstream / portage versioning right,
 but occasionally you get miss-matches for some reason.

 It would be nice to allow to provide some mapping mechanism that
 existed on the overlay itself to inform euscan how to map upstream
 versions to downstream ones, but implementing that would be far too
 complex I feel.

 Instead, it would be nice to have a mechanism in the interface to set
 a Upstream version is value for each package if euscan can't tell.

 Ie:

 http://euscan.iksaif.net/package/dev-perl/HTML-TreeBuilder-LibXML/

 Upstream is 0.71 , portage is ( normalised ) to 0.710.0 , and these
 are in fact the same version. So in 0.710.0 , it would be nice to be
 able to set the upstream version manually to 0.71 so that euscan no
 longer reported it as outdated.

 I think we could actually handle perl pretty easily. I believe euscan
 will start using CPAN API to check the package versions, and we can
 embed the normalization there.

It's already the case:
https://github.com/iksaif/euscan/blob/master/pym/euscan/handlers/cpan.py
but my mangling functions are probably broken in some cases. If
somebody with better knowledge of the CPAN versioning scheme could fix
them, that would be great!
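
A minimal sketch of the normalisation discussed above, for plain decimal versions only (0.71, 4.11). It assumes the fractional part is cut into three-digit groups with at least two groups, which matches the 0.71 -> 0.710.0 and 4.11 -> 4.110.0 examples in this thread. It is not the code in cpan.py, and underscore or dotted versions are deliberately not handled.

```python
def normalize_decimal(version):
    """Turn a decimal CPAN version such as '0.71' into dotted form '0.710.0'."""
    integer, _, fraction = version.partition(".")
    # Pad the fraction to whole three-digit groups, and to at least two groups.
    width = max(6, -(-len(fraction) // 3) * 3)
    fraction = fraction.ljust(width, "0")
    groups = [fraction[i:i + 3] for i in range(0, width, 3)]
    # int() strips leading zeros inside each group ('000' becomes '0').
    return ".".join([str(int(integer))] + [str(int(group)) for group in groups])
```

Comparing normalize_decimal(upstream) with the version in the tree would stop such pairs from being reported as outdated.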

Thanks,



Re: [gentoo-dev] remote-id cpan-module

2012-06-13 Thread Corentin Chary
On Sat, Jun 2, 2012 at 7:11 PM, Torsten Veller t...@gentoo.org wrote:
 * Corentin Chary iks...@gentoo.org:
 On Thu, May 17, 2012 at 2:02 AM, Kent Fredric kentfred...@gmail.com wrote:
  On 13 May 2012 07:43, Torsten Veller t...@gentoo.org wrote:
  It doesn't even list Moose for Moose?
 
  Its probably falling outside the initial 10 results, I forgot it did that,
 
  02packages.details.txt.gz lists 72 package names for Moose-2.0602.
 
 
  Need to bolt on a { size: 100 }  to the query to expand how may
  results it will return.

 Updated remotesid.py to use that, correctly add Moose in the diff now !

 metadata.dtd was updated per bug #406287 and it contains the cpan-module
 remote-id.

 The current patch for dev-perl/* is roughly 800k big:
 http://dev.gentoo.org/~tove/files/devperlremoteids.patch

 I am going to update the files in the next days.
 Now it would be a good time to voice your concerns.

 --
 Thanks
 Torsten


I pushed some fixes to remoteids.py; it now has the ability to remove
dead remote-ids.
You should regenerate your patch, or use this one:
http://xf.iksaif.net/dev/gentoo/remoteids/dev-perl.diff
Note that I just made a small change that sorts the remote-ids
(which is great for cpan-module). If you want it, just regenerate a
patch with:
eix --only-names -C dev-perl | ./remoteids.py --diff --check-all

Thanks,

--
Corentin Chary



Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-06-04 Thread Corentin Chary
On Sat, Jun 2, 2012 at 7:30 PM, Michael Weber x...@gentoo.org wrote:

 Is there any way to verify the remote-id data?

 What programs/scripts use these fields btw?

euscan will soon, and I guess p.g.o will too.

 I could imagine a test like (i.e. <remote-id
 type="sourceforge">$a</remote-id> )
 does http://www.fs.net/$a exist.

 Michael

That could probably be done in remoteids.py
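
A minimal sketch of such a check, assuming one URL template per remote-id type. The templates below are hypothetical placeholders and this is not how remoteids.py is written; url_exists() only reports whether a HEAD request succeeds.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical URL templates; the real project pages may live elsewhere.
URL_TEMPLATES = {
    "sourceforge": "http://sourceforge.net/projects/%s/",
    "github": "https://github.com/%s",
}


def remote_id_urls(metadata_xml):
    """Return (type, id, url) for every remote-id with a known template."""
    root = ET.fromstring(metadata_xml)
    urls = []
    for node in root.iter("remote-id"):
        template = URL_TEMPLATES.get(node.get("type"))
        if template and node.text:
            remote_id = node.text.strip()
            urls.append((node.get("type"), remote_id, template % remote_id))
    return urls


def url_exists(url, timeout=10):
    """Return True if a HEAD request on url succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        return False
```

Remote-ids for which url_exists() returns False would be candidates for removal or manual review.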



Re: [gentoo-dev] Re: RFC: Add new remote-id types in metadata.dtd

2012-05-18 Thread Corentin Chary
On Thu, May 17, 2012 at 2:02 AM, Kent Fredric kentfred...@gmail.com wrote:
 On 13 May 2012 07:43, Torsten Veller t...@gentoo.org wrote:
 * Corentin Chary corentin.ch...@gmail.com:
 On Sat, Apr 21, 2012 at 03:33:18PM +1200, Kent Fredric wrote:
                                      { term: { status:latest} },
                                      { term: { 
  module.authorized:true}}

 What does this mean?
 - latest? this term looks like maintenance work.
 - what is authorized?

 latest means that it will fetch metadata for whatever is deemed the
 most recent non-dev release, which is really the only sane option to
 go for if you want a list of modules that currently pertain to the
 distribution.  You could request *all* releases and then find a union
 of elements ... but that would be both erroneous and very time
 consuming.

 It doesn't even list Moose for Moose?

 Its probably falling outside the initial 10 results, I forgot it did that,

 02packages.details.txt.gz lists 72 package names for Moose-2.0602.


 Need to bolt on a { size: 100 }  to the query to expand how may
 results it will return.

Updated remoteids.py to use that; it correctly adds Moose to the diff now!

  curl -XPOST 'http://api.metacpan.org/module/_search' -d '
 {
       "fields": [
               "module.name",
               "release"
       ],
       "query": {
               "constant_score": {
                       "filter": {
                               "and": [
                                       { "term": { "distribution": "Moose" } },
                                       { "term": { "status": "latest" } },
                                       { "term": { "mime": "text/x-script.perl-module" } },
                                       { "term": { "indexed": true } },
                                       { "term": { "module.authorized": true } }
                               ]
                       }
               }
       },
       "size": 100
 }
 '

 ^  that | grep module.name  | wc -l   # 83
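
A minimal sketch of building the same request body from Python, with the size parameter exposed. The function name module_query is hypothetical, and nothing is sent here; the body would be POSTed to the /module/_search endpoint quoted above.

```python
import json


def module_query(distribution, size=100):
    """Build the search body listing the modules of a distribution's latest release."""
    return {
        "fields": ["module.name", "release"],
        "query": {
            "constant_score": {
                "filter": {
                    "and": [
                        {"term": {"distribution": distribution}},
                        {"term": {"status": "latest"}},
                        {"term": {"mime": "text/x-script.perl-module"}},
                        {"term": {"indexed": True}},
                        {"term": {"module.authorized": True}},
                    ]
                }
            }
        },
        # Without an explicit size only the first 10 hits come back.
        "size": size,
    }


def module_query_json(distribution, size=100):
    """Serialise the body as JSON, ready to be used as POST data."""
    return json.dumps(module_query(distribution, size))
```

Raising size is what lets all module names of a large distribution such as Moose show up in a single response.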

 --
 Kent

 perl -e  print substr( \edrgmaM  SPA NOcomil.ic\\@tfrken\, \$_ * 3,
 3 ) for ( 9,8,0,7,1,6,5,4,3,2 );

 http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] [RFC] New third party mirrors

2012-04-27 Thread Corentin Chary
On Thu, Apr 26, 2012 at 8:41 PM, Michał Górny mgo...@gentoo.org wrote:
 On Thu, 26 Apr 2012 10:21:36 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 Second solution:
 github http://cloud.github.com/downloads
 github-bad-uris -http://github.com/downloads/
 -https://github.com/downloads/

 The good thing with the first one is that it would allow repoman to
 outputs something like you should use 'mirror://github'.

 Well, we could decide on something common and special like:

 github:bad-uris http://.

 And then let repoman suggest using mirror with ':bad-uris' stripped.

Works for me. What would be the next step to push this ?


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] [RFC] New third party mirrors

2012-04-26 Thread Corentin Chary
On Wed, Apr 25, 2012 at 6:41 PM, Michał Górny mgo...@gentoo.org wrote:
 On Wed, 25 Apr 2012 09:16:05 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 6:38 PM, Michał Górny mgo...@gentoo.org
 wrote:
  On Tue, 24 Apr 2012 16:19:11 +
  Robin H. Johnson robb...@gentoo.org wrote:
 
  On Tue, Apr 24, 2012 at 04:50:49PM +0200, Corentin Chary wrote:
$ ./mirrors.py --all --count
297     http://pear.php.net
297     http://pear.php.net/get
88      http://pecl.php.net
88      http://pecl.php.net/get
These are already mirror bouncers. If you visit the above,
you'll get the closest mirror for downloading.
   And since there is already ~10 mirrors with only one actual
   backend, should they go to thirdpartymirrors or not ? If not,
   what about this pseudo-mirrors already present in
   thirdpartymirrors ?
  I think we should add the pseudo-mirrors, but explicitly mark them
  as such in the file, so that they don't get duplicate entries
  added (eg adding us.pear, de.pear and the pear bouncer is bad.
  Should have just the bouncer).
 
  It'd be great if we could add some kind of additional mirror
  entries, which would be used by repoman to signal missing mirror://
  entries but won't be used for downloads.

 Yep, we could put that in it too:
 github                http://github.com/downloads/
 https://github.com/downloads/

 Per spec, portage can choose a random mirror of the list. If we put
 entries like that, these two will be equally possible as the preferred
 cloud. URL -- while they redirect one to another.

 We might decide on some common syntax like preceding all extra entries
 with '-' but I don't want to be the one deciding here.

I checked, and the current portage code already handles entries starting
with a - gracefully thanks to stack_dictlist (removing them from the
list of mirrors).
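
A minimal sketch of the behaviour described above, assuming one thirdpartymirrors line per mirror name. This is a simplified stand-in, not portage's stack_dictlist: it only shows that tokens starting with - never end up in the list used for downloads.

```python
def usable_mirrors(lines):
    """Parse thirdpartymirrors-style lines, dropping '-' entries from the result."""
    mirrors = {}
    for line in lines:
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue
        name, uris = tokens[0], tokens[1:]
        kept = mirrors.setdefault(name, [])
        for uri in uris:
            if uri.startswith("-"):
                # A "-uri" token removes a matching earlier entry and is
                # itself never kept.
                if uri[1:] in kept:
                    kept.remove(uri[1:])
                continue
            if uri not in kept:
                kept.append(uri)
    return mirrors
```

With a line such as "github http://cloud.github.com/downloads -http://github.com/downloads/", only the cloud URL remains available for downloads.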

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] [RFC] New third party mirrors

2012-04-26 Thread Corentin Chary
On Thu, Apr 26, 2012 at 9:57 AM, Zac Medico zmed...@gentoo.org wrote:
 On 04/26/2012 12:30 AM, Corentin Chary wrote:
 On Wed, Apr 25, 2012 at 6:41 PM, Michał Górny mgo...@gentoo.org wrote:
 On Wed, 25 Apr 2012 09:16:05 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 6:38 PM, Michał Górny mgo...@gentoo.org
 wrote:
 On Tue, 24 Apr 2012 16:19:11 +
 Robin H. Johnson robb...@gentoo.org wrote:

 On Tue, Apr 24, 2012 at 04:50:49PM +0200, Corentin Chary wrote:
 $ ./mirrors.py --all --count
 297     http://pear.php.net
 297     http://pear.php.net/get
 88      http://pecl.php.net
 88      http://pecl.php.net/get
 These are already mirror bouncers. If you visit the above,
 you'll get the closest mirror for downloading.
 And since there is already ~10 mirrors with only one actual
 backend, should they go to thirdpartymirrors or not ? If not,
 what about this pseudo-mirrors already present in
 thirdpartymirrors ?
 I think we should add the pseudo-mirrors, but explicitly mark them
 as such in the file, so that they don't get duplicate entries
 added (eg adding us.pear, de.pear and the pear bouncer is bad.
 Should have just the bouncer).

 It'd be great if we could add some kind of additional mirror
 entries, which would be used by repoman to signal missing mirror://
 entries but won't be used for downloads.

 Yep, we could put that in it too:
 github                http://github.com/downloads/
 https://github.com/downloads/

 Per spec, portage can choose a random mirror of the list. If we put
 entries like that, these two will be equally possible as the preferred
 cloud. URL -- while they redirect one to another.

 We might decide on some common syntax like preceding all extra entries
 with '-' but I don't want to be the one deciding here.

 I checked, and current portage code already handle entries starting
 with a - gracefully thanks to stack_dictlist (removing them from the
 list of mirrors).

 That means repoman will ignore them too. If you want existing versions
 of repoman to check for those paths in SRC_URI, you can add a line like
 this to thirdpartymirrors:

 github-bad-urls http://github.com/downloads/ https://github.com/downloads/

Hum, I checked the repoman source code, and I didn't find where it checks
if SRC_URI matches something in thirdpartymirrors. Any hint?


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] [RFC] New third party mirrors

2012-04-26 Thread Corentin Chary
On Thu, Apr 26, 2012 at 10:07 AM, Zac Medico zmed...@gentoo.org wrote:
 On 04/26/2012 01:03 AM, Corentin Chary wrote:
 On Thu, Apr 26, 2012 at 9:57 AM, Zac Medico zmed...@gentoo.org wrote:
 On 04/26/2012 12:30 AM, Corentin Chary wrote:
 On Wed, Apr 25, 2012 at 6:41 PM, Michał Górny mgo...@gentoo.org wrote:
 On Wed, 25 Apr 2012 09:16:05 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 On Tue, Apr 24, 2012 at 6:38 PM, Michał Górny mgo...@gentoo.org
 wrote:
 On Tue, 24 Apr 2012 16:19:11 +
 Robin H. Johnson robb...@gentoo.org wrote:

 On Tue, Apr 24, 2012 at 04:50:49PM +0200, Corentin Chary wrote:
 $ ./mirrors.py --all --count
 297     http://pear.php.net
 297     http://pear.php.net/get
 88      http://pecl.php.net
 88      http://pecl.php.net/get
 These are already mirror bouncers. If you visit the above,
 you'll get the closest mirror for downloading.
 And since there is already ~10 mirrors with only one actual
 backend, should they go to thirdpartymirrors or not ? If not,
 what about this pseudo-mirrors already present in
 thirdpartymirrors ?
 I think we should add the pseudo-mirrors, but explicitly mark them
 as such in the file, so that they don't get duplicate entries
 added (eg adding us.pear, de.pear and the pear bouncer is bad.
 Should have just the bouncer).

 It'd be great if we could add some kind of additional mirror
 entries, which would be used by repoman to signal missing mirror://
 entries but won't be used for downloads.

 Yep, we could put that in it too:
 github                http://github.com/downloads/
 https://github.com/downloads/

 Per spec, portage can choose a random mirror of the list. If we put
 entries like that, these two will be equally possible as the preferred
 cloud. URL -- while they redirect one to another.

 We might decide on some common syntax like preceding all extra entries
 with '-' but I don't want to be the one deciding here.

 I checked, and current portage code already handle entries starting
 with a - gracefully thanks to stack_dictlist (removing them from the
 list of mirrors).

 That means repoman will ignore them too. If you want existing versions
 of repoman to check for those paths in SRC_URI, you can add a line like
 this to thirdpartymirrors:

 github-bad-urls http://github.com/downloads/ https://github.com/downloads/

 Hum, I checked repoman source code, and I didn't find where it checks
 if SRC_URI matches something in thirdpartymirror. Any hint ?

 Search for SRC_URI.mirror in /usr/bin/repoman.

Arg.. ok, I only looked in pym/repoman/.

So two solutions here:

First one:
github http://cloud.github.com/downloads -http://github.com/downloads/
-https://github.com/downloads/
+ a small patch that would allow repoman to do something like
settings.thirdpartymirrors(keep_bad_uris=True) in order to keep uris
starting with a - in the list.

Second solution:
github http://cloud.github.com/downloads
github-bad-uris -http://github.com/downloads/ -https://github.com/downloads/

The good thing with the first one is that it would allow repoman to
output something like "you should use 'mirror://github'".
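
A minimal sketch of the first solution from repoman's side, assuming the - entries are kept while parsing. The helper names are hypothetical, this is not repoman code, and the tarball path in the usage note is made up for illustration.

```python
def parse_with_bad_uris(lines):
    """Split thirdpartymirrors-style lines into usable and '-' prefixed URIs."""
    good, bad = {}, {}
    for line in lines:
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue
        name = tokens[0]
        for uri in tokens[1:]:
            if uri.startswith("-"):
                bad.setdefault(name, []).append(uri[1:])
            else:
                good.setdefault(name, []).append(uri)
    return good, bad


def suggest_mirror(src_uri, good, bad):
    """Return the mirror:// form of src_uri, or None if no prefix matches."""
    for table in (good, bad):
        for name, prefixes in table.items():
            for prefix in prefixes:
                if src_uri.startswith(prefix):
                    rest = src_uri[len(prefix):].lstrip("/")
                    return "mirror://%s/%s" % (name, rest)
    return None
```

With the github line above, a SRC_URI such as https://github.com/downloads/foo/bar/bar-1.0.tar.gz would yield the suggestion mirror://github/foo/bar/bar-1.0.tar.gz.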


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] [RFC] New third party mirrors

2012-04-25 Thread Corentin Chary
On Tue, Apr 24, 2012 at 6:38 PM, Michał Górny mgo...@gentoo.org wrote:
 On Tue, 24 Apr 2012 16:19:11 +
 Robin H. Johnson robb...@gentoo.org wrote:

 On Tue, Apr 24, 2012 at 04:50:49PM +0200, Corentin Chary wrote:
   $ ./mirrors.py --all --count
   297     http://pear.php.net
   297     http://pear.php.net/get
   88      http://pecl.php.net
   88      http://pecl.php.net/get
   These are already mirror bouncers. If you visit the above, you'll
   get the closest mirror for downloading.
  And since there is already ~10 mirrors with only one actual
  backend, should they go to thirdpartymirrors or not ? If not, what
  about this pseudo-mirrors already present in thirdpartymirrors ?
 I think we should add the pseudo-mirrors, but explicitly mark them as
 such in the file, so that they don't get duplicate entries added (eg
 adding us.pear, de.pear and the pear bouncer is bad. Should have just
 the bouncer).

 It'd be great if we could add some kind of additional mirror entries,
 which would be used by repoman to signal missing mirror:// entries but
 won't be used for downloads.

Yep, we could put that in it too:
github  http://github.com/downloads/ https://github.com/downloads/
nongnu  http://download.savannah.nongnu.org/releases/

(I'm not really sure for nongnu).


-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] [RFC] New third party mirrors

2012-04-24 Thread Corentin Chary
Hello,

As part of my portage-janitor scripts 
(https://github.com/iksaif/portage-janitor)
I created a tool that counts URI prefixes and sorts them.
Here is the result of this script on the gentoo-x86 tree:

$ ./mirrors.py --all --count
1262 http://dev.gentoo.org
483 http://oss.tresys.com
482 http://oss.tresys.com/files
482 http://oss.tresys.com/files/refpolicy
473 http://xorg.freedesktop.org
473 http://xorg.freedesktop.org/releases
473 http://xorg.freedesktop.org/releases/individual
313 http://dev.gentoo.org/~swift
313 http://dev.gentoo.org/~swift/patches
309 http://dev.gentoo.org/~swift/patches/selinux-base-policy
297 http://pear.php.net
297 http://pear.php.net/get
296 http://hackage.haskell.org/packages
296 http://hackage.haskell.org/packages/archive
296 http://hackage.haskell.org
209 http://launchpad.net
206 http://ftp.xemacs.org
197 https://github.com
196 http://ftp.xemacs.org/pub
196 http://ftp.xemacs.org/pub/xemacs
193 http://ftp.xemacs.org/pub/xemacs/packages
179 http://gstreamer.freedesktop.org
179 http://gstreamer.freedesktop.org/src
176 http://github.com
175 http://linuxgazette.net/ftpfiles
175 http://linuxgazette.net
139 http://xorg.freedesktop.org/releases/individual/app
130 http://pear.horde.org/get
130 http://pear.horde.org
126 http://xorg.freedesktop.org/releases/individual/driver
115 http://dev.gentoo.org/~vapier
112 http://dev.gentoo.org/~vapier/dist
96  ftp://sources.redhat.com/pub
96  ftp://sources.redhat.com
94  http://savannah.nongnu.org/download
94  http://savannah.nongnu.org
92  http://xorg.freedesktop.org/releases/individual/lib
88  http://pecl.php.net
88  http://download.gna.org
88  http://pecl.php.net/get
84  http://dev.gentoo.org/~caster/distfiles
84  http://dev.gentoo.org/~caster
80  https://fedorahosted.org
74  http://get.qt.nokia.com
72  http://components.ez.no/get
72  http://components.ez.no
71  http://get.qt.nokia.com/qt
70  http://get.qt.nokia.com/qt/source
67  http://www.phrack.org/archives
67  http://www.phrack.org/archives/tgz
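
A minimal sketch of the counting idea behind this listing, assuming every URI ends in a file name. It is not the actual mirrors.py, and ties are broken by name here, which need not match the order above.

```python
from collections import Counter
from urllib.parse import urlsplit


def uri_prefixes(uri):
    """Return scheme://host and every directory prefix below it."""
    parts = urlsplit(uri)
    base = "%s://%s" % (parts.scheme, parts.netloc)
    prefixes = [base]
    # Drop the last path component, which is assumed to be the file name.
    directories = [d for d in parts.path.split("/") if d][:-1]
    for directory in directories:
        base = base + "/" + directory
        prefixes.append(base)
    return prefixes


def count_prefixes(uris):
    """Count how many URIs share each prefix, most common first."""
    counts = Counter()
    for uri in uris:
        counts.update(uri_prefixes(uri))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Feeding it every SRC_URI in the tree would produce (prefix, count) pairs like the ones listed.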

And here are some mirrors that can be deduced from that (except freebsd-jp
which I found reading the linphone ebuild):

freebsd-jp  ftp://ports.jp.FreeBSD.org/pub/FreeBSD-jp/ 
ftp://ftp.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp1.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp2.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp3.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp4.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp5.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp6.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp7.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp8.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp9.jp.freebsd.org/pub/FreeBSD-jp/ 
ftp://ftp.ics.es.osaka-u.ac.jp/pub/FreeBSD-jp/ 
ftp://ftp.ics.es.osaka-u.ac.jp/pub/mirrors/FreeBSD-jp/ 
ftp://ftp.allbsd.org/pub/FreeBSD-jp/ ftp://ftp.nagaokaut.ac.jp/pub/FreeBSD-jp/ 
ftp://ftp.nara.wide.ad.jp/pub/FreeBSD-jp/ ftp://ftp.jaist.ac.jp/pub/FreeBSD-jp/
pear    http://pear.php.net/ http://de.pear.php.net/
http://us.pear.php.net/
redhat  ftp://sources.redhat.com/pub/ 
ftp://mirrors.kernel.org/sources.redhat.com/ 
ftp://mirrors.kernel.org/sources.redhat.com/ 
http://mirrors.kernel.org/sources.redhat.com/ 
http://sources-redhat.mirrors.airband.net/ 
ftp://gd.tuwien.ac.at/gnu/sourceware/ http://gd.tuwien.ac.at/gnu/sourceware/ 
ftp://ftp.gwdg.de/pub/linux/sources.redhat.com/ 
http://ftp.gwdg.de/pub/linux/sources.redhat.com/ 
ftp://ftp-stud.fht-esslingen.de/pub/Mirrors/sourceware.org/ 
http://ftp-stud.fht-esslingen.de/pub/Mirrors/sourceware.org/ 
http://bo.mirror.garr.it/mirrors/sourceware.org/ 
ftp://bo.mirror.garr.it/mirrors/sourceware.org/ 
ftp://ftp.mirrorservice.org/sites/sources.redhat.com/pub/ 
http://www.mirrorservice.org/sites/sources.redhat.com/pub/
xemacs  http://ftp.xemacs.org/pub/ ftp://ftp.sa.xemacs.org/pub/
xorg    http://xorg.freedesktop.org/pub/ http://ftp.x.org/pub/

That would affect ~1000 URIs.
Unfortunately I didn't find any mirrors for:
- pecl.php.net
- hackage.haskell.org
- get.qt.nokia.com
- launchpad

But maybe we should still add them to thirdpartymirrors ? Look at 3dgamers, 
beyondunreal, bitbucket, github, gnome, kernel, xfce, vdrfiles and 
vdr-developerorg: they all only have one mirror !

For those afraid to add these mirrors and patch all affected ebuilds: no worries,
mirrors.py can do it for you, and I have already generated split patches
for various thirdpartymirrors entries here: http://xf.iksaif.net/dev/gentoo/

Note that as part of https://bugs.gentoo.org/show_bug.cgi?id=405533 mgorny
already committed a massive amount of patches adding mirror:// when the URI
prefix was already present in thirdpartymirrors.

Thanks,

-- 
Corentin Chary
http://xf.iksaif.net/




Re: [gentoo-dev] [RFC] New third party mirrors

2012-04-24 Thread Corentin Chary
On Tue, Apr 24, 2012 at 4:35 PM, Robin H. Johnson robb...@gentoo.org wrote:
 On Tue, Apr 24, 2012 at 03:10:00PM +0200, Corentin Chary wrote:
 Hello,

 As part of my portage-janitor scripts 
 (https://github.com/iksaif/portage-janitor)
 I created a tool that counts URIs prefixes and sort them.
 Here is the result of this script on the gentoo-x86 tree:

 $ ./mirrors.py --all --count
 297     http://pear.php.net
 297     http://pear.php.net/get
 88      http://pecl.php.net
 88      http://pecl.php.net/get
 These are already mirror bouncers. If you visit the above, you'll get
 the closest mirror for downloading.

And since there are already ~10 mirrors with only one actual backend,
should they go to thirdpartymirrors or not? If not, what about the
pseudo-mirrors already present in thirdpartymirrors?

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-21 Thread Corentin Chary
On Sat, Apr 21, 2012 at 03:33:18PM +1200, Kent Fredric wrote:
 On 21 April 2012 08:33, Corentin Chary corentin.ch...@gmail.com wrote:
  On Fri, Apr 20, 2012 at 9:35 PM, Kent Fredric kentfred...@gmail.com wrote:
  On 21 April 2012 01:34, Corentin Chary corentin.ch...@gmail.com wrote:
  Yeah, not very important, but seems to work with this patch:
  https://github.com/iksaif/portage-janitor/commit/972aff94744741e34e99f917337430d245883c48
 
 http://api.metacpan.org/release/Scalar-List-Utils
 
 Will be far better an option than parsing HTML =)
 
 And even better:
 
 curl -XPOST 'http://api.metacpan.org/module/_search' -d '
 {
   "fields": [
     "module.name",
     "release"
   ],
   "query": {
     "constant_score": {
       "filter": {
         "and": [
           { "term": { "distribution": "Scalar-List-Utils" } },
           { "term": { "status": "latest" } },
           { "term": { "mime": "text/x-script.perl-module" } },
           { "term": { "indexed": true } },
           { "term": { "module.authorized": true } }
         ]
       }
     }
   }
 }
 '
 

It is !

$ ./remoteids.py --diff WWW-Bugzilla Moose bioperl Scalar-List-Utils libwww-perl
--- a/dev-perl/WWW-Bugzilla/metadata.xml
+++ b/dev-perl/WWW-Bugzilla/metadata.xml
@@ -3,6 +3,8 @@
 <pkgmetadata>
   <herd>perl</herd>
   <upstream>
+<remote-id type="cpan-module">WWW::Bugzilla::Search</remote-id>
+<remote-id type="cpan-module">WWW::Bugzilla</remote-id>
 <remote-id type="cpan">WWW-Bugzilla</remote-id>
   </upstream>
 </pkgmetadata>
--- a/dev-perl/Moose/metadata.xml
+++ b/dev-perl/Moose/metadata.xml
@@ -3,6 +3,17 @@
 <pkgmetadata>
   <herd>perl</herd>
   <upstream>
+<remote-id type="cpan-module">Class::MOP</remote-id>
+<remote-id type="cpan-module">Moose::Meta::Attribute::Native::Trait::Counter</remote-id>
+<remote-id type="cpan-module">Moose::Meta::Attribute::Native::Trait::String</remote-id>
+<remote-id type="cpan-module">Moose::Meta::Method::Delegation</remote-id>
+<remote-id type="cpan-module">Moose::Meta::TypeConstraint::Role</remote-id>
+<remote-id type="cpan-module">Moose::Meta::Attribute::Custom::Moose</remote-id>
+<remote-id type="cpan-module">Moose::Meta::Attribute</remote-id>
+<remote-id type="cpan-module">Moose::Meta::Method</remote-id>
+<remote-id type="cpan-module">Moose::Error::Croak</remote-id>
+<remote-id type="cpan-module">Moose::Util::MetaRole</remote-id>
+<remote-id type="cpan-module">Moose::Role</remote-id>
 <remote-id type="cpan">Moose</remote-id>
   </upstream>
 </pkgmetadata>
--- a/perl-core/Scalar-List-Utils/metadata.xml
+++ b/perl-core/Scalar-List-Utils/metadata.xml
@@ -3,6 +3,9 @@
 <pkgmetadata>
   <herd>perl</herd>
   <upstream>
+<remote-id type="cpan-module">List::Util</remote-id>
+<remote-id type="cpan-module">Scalar::Util</remote-id>
+<remote-id type="cpan-module">List::Util::XS</remote-id>
 <remote-id type="cpan">Scalar-List-Utils</remote-id>
   </upstream>
 </pkgmetadata>
--- a/dev-perl/libwww-perl/metadata.xml
+++ b/dev-perl/libwww-perl/metadata.xml
@@ -3,6 +3,17 @@
 <pkgmetadata>
   <herd>perl</herd>
   <upstream>
+<remote-id type="cpan-module">LWP::Protocol::data</remote-id>
+<remote-id type="cpan-module">LWP::Protocol::gopher</remote-id>
+<remote-id type="cpan-module">LWP::Debug</remote-id>
+<remote-id type="cpan-module">LWP::UserAgent</remote-id>
+<remote-id type="cpan-module">LWP::Authen::Digest</remote-id>
+<remote-id type="cpan-module">LWP::Protocol::file</remote-id>
+<remote-id type="cpan-module">LWP::Protocol::loopback</remote-id>
+<remote-id type="cpan-module">LWP::Protocol::MyFTP</remote-id>
+<remote-id type="cpan-module">LWP::Protocol::ftp</remote-id>
+<remote-id type="cpan-module">LWP::Protocol::cpan</remote-id>
+<remote-id type="cpan-module">LWP::MemberMixin</remote-id>
 <remote-id type="cpan">libwww-perl</remote-id>
   </upstream>
 </pkgmetadata>


-- 
Corentin Chary
http://xf.iksaif.net/



Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-21 Thread Corentin Chary
On Sat, Apr 21, 2012 at 11:32:26AM +0200, Michał Górny wrote:
 On Thu, 19 Apr 2012 21:32:49 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:
 
  On Thu, Apr 19, 2012 at 6:54 PM, Michał Górny mgo...@gentoo.org
  wrote:
   On Thu, 19 Apr 2012 17:31:11 +0200
   Corentin Chary corentin.ch...@gmail.com wrote:
  
   -      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
   +      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>
  
   Wouldn't it be better if we kept them sorted?
  
  They were not sorted, so I just appended them, but I can provide another
  patch sorting them too if preferred.
 
 I'd appreciate it if you could do that, along with that cpan-module
 addition, and if no one objects, I'll just commit that.
 
 -- 
 Best regards,
 Michał Górny


--- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.0 +0100
+++ b/metadata/dtd/metadata.dtd 2012-04-21 13:09:52.557223402 +0200
@@ -61,7 +61,7 @@
 <!ELEMENT bugs-to (#PCDATA)>
 <!-- specify a type of package identification tracker -->
 <!ELEMENT remote-id (#PCDATA)>
-  <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
+  <!ATTLIST remote-id type (bitbucket|cpan|cpan-module|cran|ctan|freshmeat|github|gitorious|google-code|launchpad|pear|pecl|pypi|rubyforge|rubygems|sourceforge|sourceforge-jp|vim) #REQUIRED>
 
   <!-- category/package information for cross-linking in descriptions
 and useflag descriptions -->


-- 
Corentin Chary
http://xf.iksaif.net/




Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-20 Thread Corentin Chary
On Fri, Apr 20, 2012 at 9:37 AM, Kent Fredric kentfred...@gmail.com wrote:
 On 20 April 2012 03:31, Corentin Chary corentin.ch...@gmail.com wrote:
 Add rubygems, github, gitorious, pecl, pear, bitbucket.
 All of them are handled by my remoteids.py script.

 ref: https://bugs.gentoo.org/show_bug.cgi?id=406287
 ref: https://github.com/iksaif/portage-janitor/blob/master/remoteids.py

 --- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.0 +0100
 +++ b/metadata/dtd/metadata.dtd 2012-04-19 14:22:14.077954310 +0200
 @@ -61,7 +61,7 @@
     <!ELEMENT bugs-to (#PCDATA)>
     <!-- specify a type of package identification tracker -->
     <!ELEMENT remote-id (#PCDATA)>
 -      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
 +      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>

   <!-- category/package information for cross-linking in descriptions
     and useflag descriptions -->

 --
 Corentin Chary
 http://xf.iksaif.net/


 I suggested last week on #gentoo-perl that it might be nice to have
 'cpan' and 'cpan-module'  ( or something like that ) to disambiguate 2
 queryable terms. ( where 'cpan'  = 'the package name on cpan' )

 For some purposes, its most convenient to use the distribution name,
 and for other purposes, (ie: cpan clients) its more convenient to use
 a Module name, and its not easy to translate between the two, as
 Module names sometimes switch between packages  they're shipped in.

 For instance, a while ago, the BioPerl module was shipped in a
 distribution 'bioperl' , which has only recently been changed to
 BioPerl


 http://api.metacpan.org/release/_search?q=distribution:bioperlfields=archive,author,date,download_url

 http://api.metacpan.org/release/_search?q=distribution:BioPerlfields=archive,author,date,download_url

 vs


 http://api.metacpan.org/module/_search?q=module.name:Bio\:\:Perlfields=distribution,author,release

Looks sane, since the goal of remote-id is to identify the package
upstream.
Do you think you could patch remoteids.py to generate tags for cpan /
cpan-module? Or at least give me a pseudo-algorithm that does the trick.
Thanks :)

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-20 Thread Corentin Chary
On Fri, Apr 20, 2012 at 10:26 AM, Kent Fredric kentfred...@gmail.com wrote:
 On 20 April 2012 19:46, Corentin Chary corentin.ch...@gmail.com wrote:
 On Fri, Apr 20, 2012 at 9:37 AM, Kent Fredric kentfred...@gmail.com wrote:
 On 20 April 2012 03:31, Corentin Chary corentin.ch...@gmail.com wrote:
 Add rubygems, github, gitorious, pecl, pear, bitbucket.
 All of them are handled by my remoteids.py script.

 ref: https://bugs.gentoo.org/show_bug.cgi?id=406287
 ref: https://github.com/iksaif/portage-janitor/blob/master/remoteids.py

 --- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.0 +0100
 +++ b/metadata/dtd/metadata.dtd 2012-04-19 14:22:14.077954310 +0200
 @@ -61,7 +61,7 @@
     <!ELEMENT bugs-to (#PCDATA)>
     <!-- specify a type of package identification tracker -->
     <!ELEMENT remote-id (#PCDATA)>
 -      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
 +      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>

   <!-- category/package information for cross-linking in descriptions
     and useflag descriptions -->

 --
 Corentin Chary
 http://xf.iksaif.net/


 I suggested last week on #gentoo-perl that it might be nice to have
 'cpan' and 'cpan-module'  ( or something like that ) to disambiguate 2
 queryable terms. ( where 'cpan'  = 'the package name on cpan' )

 For some purposes, its most convenient to use the distribution name,
 and for other purposes, (ie: cpan clients) its more convenient to use
 a Module name, and its not easy to translate between the two, as
 Module names sometimes switch between packages  they're shipped in.

 For instance, a while ago, the BioPerl module was shipped in a
 distribution 'bioperl' , which has only recently been changed to
 BioPerl


 http://api.metacpan.org/release/_search?q=distribution:bioperlfields=archive,author,date,download_url

 http://api.metacpan.org/release/_search?q=distribution:BioPerlfields=archive,author,date,download_url

 vs


 http://api.metacpan.org/module/_search?q=module.name:Bio\:\:Perlfields=distribution,author,release

 Looks sane, since the goal of remote-id is to identify the package
 upstream.
 Do you think you could patch remoteids.py to generate tags for cpan /
 cpan-module? Or at least give me a pseudo-algorithm that does the trick.
 Thanks :)

 --
 Corentin Chary
 http://xf.iksaif.net



 That is sadly not straight forward.  Extracting the package name can
 be straight forward if you have the URL, because the package name is
 literally the same as the archive name in SRC_URI , sans version
 information.

 However, if you look at many perl ebuilds, you'll notice many lack
 this field and we've got other things in place, so the current parsing
 technique you use to detect uses of SRC_URI wont work there ( I could
 be wrong, I don't fully grok your python code )

Currently it uses SRC_URI and HOMEPAGE, but honestly it wouldn't be
hard to use any other environment variable and to run some checks
against a web service.
Anyway, for tricky cases it can still be done by hand.
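
As a rough illustration of the easy case described above (the package
name is the archive name in SRC_URI, minus the version), here is a
minimal Python sketch. It is not taken from remoteids.py: the function
name, the list of archive suffixes, and the version number in the
example URI are all assumptions, and it expects a URI whose ebuild
variables have already been expanded.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Archive suffixes to strip; this list is an assumption, not exhaustive.
_ARCHIVE_EXT = re.compile(r"\.(tar\.gz|tar\.bz2|tar\.xz|tgz|zip)$")
# A trailing "-<version>" such as "-4.300014" or "-v1.2.3".
_VERSION = re.compile(r"-v?\d[\w.]*$")


def cpan_dist_from_src_uri(uri):
    """Return the distribution name encoded in an archive URI (hypothetical helper)."""
    filename = posixpath.basename(urlsplit(uri).path)
    stem = _ARCHIVE_EXT.sub("", filename)
    return _VERSION.sub("", stem)


# The version number below is made up for the example.
print(cpan_dist_from_src_uri(
    "mirror://cpan/authors/id/R/RJ/RJBS/Dist-Zilla-4.300014.tar.gz"))
```

This only covers the 'cpan' (distribution) id. As noted in the quoted
text, a 'cpan-module' id cannot be derived from the file name alone.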

 And more-over, determining the value of 'cpan-module' may be
 impossible without access to the tar.gz itself, or querying the
 MetaCPAN API.

 Usually, upstream are sensible and have package names which closely
 correspond with the module names, ie: Dist::Zilla is shipped in
 'Dist-Zilla-$VERSION.tar.gz',  but there are many packages which dont
 do this, such as this notable example:
 https://metacpan.org/release/Scalar-List-Utils  , which has no modules
 corresponding to the package name, and no way to divine the/a 'main'
 module from the package itself. ( and this is exacerbated by packages
 changing names, or package joins ( 2 packages becoming 1 via releasing
 modules together ),  and package splits ( 1 package rips into 2 sets
 of modules ).

 Essentially, using a cpan-module as an identifier is somewhat
 forwards only , and even then, what it will resolve to is governed
 by time.

 This is fine for CPAN clients, which do the resolution hot, using the
 whole of CPAN as their data, if a user asks for Foo::Bar, their cpan
 client will ask a cpan server ( or regularly (hourly) updated list )
 as to what package that module can be found in ( and this only returns
 the most recent package, so name changes and so-forth are invisible to
 the user ).

 And being helpful to CPAN clients is one of the reasons we want this
 value as a specifiable option in the first place. For us, its easier
 to track the package name, and then when that has to change we can
 manually resolve the issue

 --
 Kent

 perl -e  print substr( \edrgmaM  SPA NOcomil.ic\\@tfrken\, \$_ * 3,
 3 ) for ( 9,8,0,7,1,6,5,4,3,2 );

 http://kent-fredric.fox.geek.nz




-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-20 Thread Corentin Chary
On Sat, Apr 21, 2012 at 12:39:26AM +1200, Kent Fredric wrote:
 If you really want to support Perl Modules, ( which theres not much
 need for at present, looks like the team have gone through already for
 the most part and added remote-id's where possible already ), anything
 that inherits 'perl-module.eclass' has a bit of magic, in that neither
 SRC_URI or HOMEPAGE is required in the ebuild, and it just gets the
 package name from what gentoo is using.  We've tried to be as close to
 upstream as possible for the ease of maintenance.

Actually the eclass seems to fill SRC_URI and HOMEPAGE, so when you source
the ebuild they are available.

 However, there are still exception cases, for instance, BioPerl has to
 define 'MY_PN' to tell the perl-module eclass to use a different token
 ( and when this is present, it should be sufficient to say that that
 should be the remote-id instead of the package name:
 see dev-perl/Moose  # an example with neither src_uri or homepage
 see sci-biology/bioperl # an example where the package name has
 been forced overridden as its changed upstream
 
 But resolving module names is much trickier, its easy-ish to map a
 module name to a package using the service, but doing it the other way
 round is not so straight forward, as one package can have many
 modules, and its common in perl to state dependencies in terms of the
 module to require, not the package its in, but there's also often a
 defacto main module.
 
 But I'm myself still working out how to best do that with the service
 , so auto-populating a cpan-module identifier can be left to later,
 its just something I considered useful to have metadata wise because
 that value is more useful to users.

Yeah, not very important, but seems to work with this patch:
https://github.com/iksaif/portage-janitor/commit/972aff94744741e34e99f917337430d245883c48

Example:
$ python remoteids.py --diff WWW-Bugzilla Moose bioperl
--- a/dev-perl/WWW-Bugzilla/metadata.xml
+++ b/dev-perl/WWW-Bugzilla/metadata.xml
@@ -3,6 +3,7 @@
 <pkgmetadata>
   <herd>perl</herd>
   <upstream>
+    <remote-id type="cpan-module">WWW::Bugzilla</remote-id>
     <remote-id type="cpan">WWW-Bugzilla</remote-id>
   </upstream>
 </pkgmetadata>
--- a/dev-perl/Moose/metadata.xml
+++ b/dev-perl/Moose/metadata.xml
@@ -3,6 +3,7 @@
 <pkgmetadata>
   <herd>perl</herd>
   <upstream>
+    <remote-id type="cpan-module">Moose</remote-id>
     <remote-id type="cpan">Moose</remote-id>
   </upstream>
 </pkgmetadata>
--- a/sci-biology/bioperl/metadata.xml
+++ b/sci-biology/bioperl/metadata.xml
@@ -8,6 +8,7 @@ 
     <flag name="db">Install sci-biology/bioperl-run</flag>
   </use>
   <upstream>
+    <remote-id type="cpan-module">BioPerl</remote-id>
     <remote-id type="cpan">BioPerl</remote-id>
   </upstream>
 </pkgmetadata>

-- 
Corentin Chary
http://xf.iksaif.net/



Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-20 Thread Corentin Chary
On Fri, Apr 20, 2012 at 9:35 PM, Kent Fredric kentfred...@gmail.com wrote:
 On 21 April 2012 01:34, Corentin Chary corentin.ch...@gmail.com wrote:
 Yeah, not very important, but seems to work with this patch:
 https://github.com/iksaif/portage-janitor/commit/972aff94744741e34e99f917337430d245883c48

 Example:
 $ python remoteids.py --diff WWW-Bugzilla Moose bioperl
 --- a/dev-perl/WWW-Bugzilla/metadata.xml
 +++ b/dev-perl/WWW-Bugzilla/metadata.xml
 @@ -3,6 +3,7 @@
  pkgmetadata
 ...
 --
 Corentin Chary
 http://xf.iksaif.net/


 What does it to for say, Scalar-List-Utils ?  or libwww-perl ?

Nothing, those are cases you'll have to handle by hand.
Another option is to parse the HTML from
http://search.cpan.org/dist/Scalar-List-Utils/ but I'm not a fan of
HTML parsing.

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-19 Thread Corentin Chary
Add rubygems, github, gitorious, pecl, pear, bitbucket.
All of them are handled by my remoteids.py script.

ref: https://bugs.gentoo.org/show_bug.cgi?id=406287
ref: https://github.com/iksaif/portage-janitor/blob/master/remoteids.py

--- a/metadata/dtd/metadata.dtd 2010-03-02 18:52:11.0 +0100
+++ b/metadata/dtd/metadata.dtd 2012-04-19 14:22:14.077954310 +0200
@@ -61,7 +61,7 @@
 <!ELEMENT bugs-to (#PCDATA)>
 <!-- specify a type of package identification tracker -->
 <!ELEMENT remote-id (#PCDATA)>
-  <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
+  <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>
 
   <!-- category/package information for cross-linking in descriptions
 and useflag descriptions -->

-- 
Corentin Chary
http://xf.iksaif.net/




Re: [gentoo-dev] RFC: Add new remote-id types in metadata.dtd

2012-04-19 Thread Corentin Chary
On Thu, Apr 19, 2012 at 6:54 PM, Michał Górny mgo...@gentoo.org wrote:
 On Thu, 19 Apr 2012 17:31:11 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 -      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran) #REQUIRED>
 +      <!ATTLIST remote-id type (freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket) #REQUIRED>

 Wouldn't it be better if we kept them sorted?

They were not sorted, so I just appended them, but I can provide another
patch sorting them too if preferred.
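
For reference, sorting the enumeration is mechanical. The Python sketch
below is only an illustration (the helper name is made up); it takes
the parenthesised alternation from the ATTLIST line and returns it in
alphabetical order with duplicates dropped.

```python
def sort_remote_id_types(alternation):
    """Sort a DTD enumeration such as "(b|a|c)" alphabetically (hypothetical helper)."""
    types = alternation.strip("()").split("|")
    return "(" + "|".join(sorted(set(types))) + ")"


unsorted_types = (
    "(freshmeat|sourceforge|sourceforge-jp|cpan|vim|google-code|ctan|"
    "pypi|rubyforge|cran|rubygems|github|gitorious|pecl|pear|bitbucket)"
)
print(sort_remote_id_types(unsorted_types))
```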


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] thirdpartymirrors URL for bitbucket

2012-04-11 Thread Corentin Chary
On Wed, Apr 11, 2012 at 12:44 PM, Michał Górny mgo...@gentoo.org wrote:
 Hello all,

 Similarly to github, bitbucket does enforce SSL by default, and
 downloads are redirected to another, non-https URI. Thus, I'd like to
 add the following thirdpartymirrors entry:

 bitbucket       http://cdn.bitbucket.org

 The path part of URI is consistent with the usual https://bitbucket.org
 one, so if the URI stops working, we can replace it with the 'official'
 one.

 Any comments?

Note that you could use my mirrors.py script (with some tweaks) to
generate a patch that makes ebuilds use this mirror.

- https://github.com/iksaif/portage-janitor (portage-janitor scripts,
including mirrors.py and others)
- https://bugs.gentoo.org/show_bug.cgi?id=405533 (Fix mirrors in
multiple ebuilds)

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] New eclass for Python

2012-04-04 Thread Corentin Chary
On Mon, Mar 26, 2012 at 6:23 PM, Krzysztof Pawlik nelch...@gentoo.org wrote:
 On 26/03/12 18:11, Krzysztof Pawlik wrote:
 On 26/03/12 09:20, justin wrote:
 On 25/03/12 20:56, Krzysztof Pawlik wrote:
 On 28/02/12 22:13, Krzysztof Pawlik wrote:
 If there are no objections then during the weekend (March 3, 4) I will 
 add this
 to portage (after finishing remaining TODO items, PyPy requires 4G of 
 RAM(!!)).

 Hello,

 Slightly late due to Real Life™ but finally it's in the main tree :)

 (and yes - I've tested it with pypy - works as expected :)


 Hi,

 is there any documentation beside the man page somewhere?

 No.

 I tried to port some ebuilds but as soon I set

 PYTHON_COMPAT=python2_7 python2_6 python2_5 pypy1_8

 inherit python-distutils-ng

 I get

   REQUIRED_USE: USE flag 'python_targets_python3_1' is not in IUSE

 Did I do something wrong, or is there something not straight in the eclass?

 Can you send me the whole ebuild off-list?

 There are two ebuilds using the eclass that I've used as tests:
 http://git.overlays.gentoo.org/gitweb/?p=dev/nelchael.git;a=tree;f=dev-python;h=f1a8e00e3e6df33806d8972c8898f1187163bd3d;hb=HEAD

 Ok, found a bug: REQUIRED_USE can't contain elements not in USE, so if you
 excluded python3_1 from PYTHON_COMPAT it didn't appear in IUSE too -
 REQUIRED_USE contained invalid value. Fixed by below patch:

 nelchael@s-lappy ~/.../gentoo-x86/eclass$ cvs diff
 Index: python-distutils-ng.eclass
 ===
 RCS file: /var/cvsroot/gentoo-x86/eclass/python-distutils-ng.eclass,v
 retrieving revision 1.2
 diff -u -r1.2 python-distutils-ng.eclass
 --- python-distutils-ng.eclass  26 Mar 2012 06:12:53 -      1.2
 +++ python-distutils-ng.eclass  26 Mar 2012 16:20:52 -
 @@ -105,11 +105,11 @@
        esac
  }

 -required_use_str=" || (
 -       python_targets_python2_5 python_targets_python2_6 python_targets_python2_7
 -       python_targets_python3_1 python_targets_python3_2
 -       python_targets_jython2_5
 -       python_targets_pypy1_7 python_targets_pypy1_8 )"
 +required_use_str=""
 +for impl in ${PYTHON_COMPAT}; do
 +       required_use_str="${required_use_str} python_targets_${impl}"
 +done
 +required_use_str=" || ( ${required_use_str} )"
  if [[ ${PYTHON_OPTIONAL} = yes ]]; then
        IUSE+=python
        REQUIRED_USE+=" python? ( ${required_use_str} )"


 --
 Krzysztof Pawlik  nelchael at gentoo.org  key id: 0xF6A80E46
 desktop-misc, java, vim, kernel, python, apache...


I have a feature request for python-distutils-ng (or maybe it's already
possible but I don't know how).

I have a package that depends on python-dateutil:python-2 for
python2_x and python-dateutil:python-3 for python3_x.
Would it be possible to have virtual targets like python, python2,
python3, pypy, jython?

Thanks,


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] New eclass for Python

2012-04-04 Thread Corentin Chary
On Wed, Apr 4, 2012 at 4:22 PM, Mike Gilbert flop...@gentoo.org wrote:
 On Wed, Apr 4, 2012 at 4:50 AM, Corentin Chary corentin.ch...@gmail.com 
 wrote:
 I have a package that depends on python-dateutil:python-2 for
 python2_x and python-dateutil:python-3 for python3_x.
 Would it be possible to have virtual targets like python, python2,
 python3, pypy, jython?


 With regards to python-dateutil: As of python-dateutil-2.1, there are
 no longer separate slots for python-2 and python-3. As well, I masked
 the only version (2.0) with SLOT=python-3.

 For future compatibility, you should remove the slot from your
 dependencies and just depend on dev-python/python-dateutil.


Yep, I just saw that. But celery is full of examples like that,
since it has different dependencies for python2, python2.5, python2.5
+ tests, jython, jython + tests, etc.
Having a way to group similar Python implementations would be great.
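
To make the request concrete, here is a purely hypothetical Python
sketch of such a grouping. Nothing like this exists in the eclass as
far as this thread shows; the family names and the function are
assumptions used only to show which implementations each virtual
target would cover.

```python
import re
from collections import defaultdict


def group_targets(python_compat):
    """Group entries like "python2_7" or "pypy1_8" into families (hypothetical)."""
    groups = defaultdict(list)
    for impl in python_compat.split():
        match = re.fullmatch(r"([a-z]+)(\d+)_(\d+)", impl)
        if match is None:
            continue  # skip anything that does not look like name<major>_<minor>
        name, major = match.group(1), match.group(2)
        # CPython is split by major version; other implementations form one family.
        family = "python" + major if name == "python" else name
        groups[family].append(impl)
    return dict(groups)


print(group_targets("python2_7 python2_6 python3_2 pypy1_8 jython2_5"))
```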


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] lurking *.ebuild'less packages

2012-03-16 Thread Corentin Chary
On Thu, Mar 15, 2012 at 10:42 PM, Matt Turner matts...@gentoo.org wrote:
 On Thu, Mar 15, 2012 at 4:59 PM, Sergei Trofimovich sly...@gentoo.org wrote:
 slep noticed and reported an odd thing:

 $ euse -i kate
 ...
 ls: cannot access 
 /gentoo/portage/metadata/cache/kde-base/kdebindings-perl-*: No such file or 
 directory
 ls: cannot access 
 /gentoo/portage/metadata/cache/kde-base/kdebindings-ruby-*: No such file or 
 directory
 ...

 The dirs they don't contain ebuilds. Only metadata.
 KDE team is aware and fixing their orphans.

 I've decided to write a hack to find such instances in tree:

 $ ./find_empty.sh ~/portage/gentoo-x86/
 kde-base/kdeaccessibility-colorschemes
 kde-base/kdeaccessibility-iconthemes
 kde-base/kdebase-wallpapers
 kde-base/kdebindings-csharp
 kde-base/kdebindings-perl
 kde-base/kdebindings-ruby
 kde-base/kvtml-data
 kde-base/smoke
 media-plugins/mytharchive
 media-plugins/mythbrowser
 media-plugins/mythgallery
 media-plugins/mythgame
 media-plugins/mythmovies
 media-plugins/mythmusic
 media-plugins/mythnetvision
 media-plugins/mythnews
 media-plugins/mythweather
 x11-themes/mythtv-themes

 Is there any reason to leave such ebuildless directories?

 Thanks!

 --

  Sergei

 Let's get this script added to http://qa-reports.gentoo.org/
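
The find_empty.sh script itself is not shown in this thread. As an
illustration of the check it describes (package directories that hold
metadata but no ebuild), here is a minimal Python sketch; the function
name and the "has metadata.xml, has no *.ebuild" criterion are
assumptions based on the description above.

```python
import os


def find_ebuildless(tree):
    """List category/package directories with a metadata.xml but no ebuild."""
    found = []
    for category in sorted(os.listdir(tree)):
        cat_dir = os.path.join(tree, category)
        if not os.path.isdir(cat_dir):
            continue
        for package in sorted(os.listdir(cat_dir)):
            pkg_dir = os.path.join(cat_dir, package)
            if not os.path.isdir(pkg_dir):
                continue
            files = os.listdir(pkg_dir)
            has_ebuild = any(name.endswith(".ebuild") for name in files)
            if "metadata.xml" in files and not has_ebuild:
                found.append(category + "/" + package)
    return found

# Usage (path is an example): find_ebuildless("/usr/portage")
```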


Speaking of qa-reports, do you think checks for SRC_URIs that don't use
mirror:// when they should, and for metadata.xml files that lack a
remote-id when they should have one, belong there?
If yes, please take a look at:
- https://github.com/iksaif/portage-janitor
- https://bugs.gentoo.org/show_bug.cgi?id=405533
- https://bugs.gentoo.org/show_bug.cgi?id=406287

Thanks,
-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Gentoo Janitor scripts

2012-02-29 Thread Corentin Chary
On Mon, Feb 27, 2012 at 4:10 PM, Paweł Hajdan, Jr.
phajdan...@gentoo.org wrote:
 On 2/20/12 6:03 PM, Corentin Chary wrote:
 Since I plan to use the remote-id tag for euscan, and I already
 use SRC_URI but would like all ebuilds to use mirrors, I've written two
 scripts to clean up your ebuilds and metadata.
 They are available here: https://github.com/iksaif/portage-janitor
 Here is what you can do with them:

 python remoteids.py --diff pycuda Test-Tester Alien-SDL ostinato
 --- a/dev-python/pycuda/metadata.xml
 +++ b/dev-python/pycuda/metadata.xml
 @@ -4,4 +4,7 @@
         <maintainer>
                 <email>sp...@gentoo.org</email>
         </maintainer>
 +        <upstream>
 +                <remote-id type="pypi">pycuda</remote-id>
 +        </upstream>
  </pkgmetadata>

 Maybe some bits could be integrated to repoman...

 I second that, those remoteids.py changes LGTM (look good to me).

 As always, I second any effort to make those useful things part of
 official Gentoo.


Fix remote ids bug: https://bugs.gentoo.org/show_bug.cgi?id=406287
Fix mirrors bug: https://bugs.gentoo.org/show_bug.cgi?id=405533

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] Re: Gentoo Janitor scripts

2012-02-22 Thread Corentin Chary
I did a quick script to count most used prefixes in SRC_URI yesterday
(https://github.com/iksaif/portage-janitor/blob/master/mirrors.py)

Here is the (filtered) result:

$ eix --only-names | python mirrors.py --count
960 http://dev.gentoo.org
372 http://xorg.freedesktop.org
372 http://xorg.freedesktop.org/releases
372 http://xorg.freedesktop.org/releases/individual
306 http://pear.php.net
306 http://pear.php.net/get
256 http://oss.tresys.com
255 http://oss.tresys.com/files
255 http://oss.tresys.com/files/refpolicy
225 http://hackage.haskell.org/packages
225 http://hackage.haskell.org/packages/archive
225 http://hackage.haskell.org
206 http://ftp.xemacs.org
201 https://github.com
196 http://ftp.xemacs.org/pub
196 http://ftp.xemacs.org/pub/xemacs
193 http://ftp.xemacs.org/pub/xemacs/packages
181 http://gstreamer.freedesktop.org
181 http://gstreamer.freedesktop.org/src
175 http://launchpad.net
175 http://linuxgazette.net
143 http://github.com
130 http://pear.horde.org
130 http://pear.horde.org/get
101 http://savannah.nongnu.org/download
101 http://savannah.nongnu.org
100 http://get.qt.nokia.com
97  ftp://sources.redhat.com/pub
97  ftp://sources.redhat.com
96  http://get.qt.nokia.com/qt
95  http://get.qt.nokia.com/qt/source
90  http://download.gna.org
75  http://pecl.php.net
75  http://pecl.php.net/get
72  http://components.ez.no/get
72  http://components.ez.no
69  https://fedorahosted.org
67  http://www.phrack.org/archives
67  http://www.phrack.org/archives/tgz
67  http://www.phrack.org


From that output we can easily find new entries for
thirdpartymirrors, for example:
gentoo-dev  http://dev.gentoo.org
xorg        http://xorg.freedesktop.org
gna         http://download.gna.org
pecl        http://pecl.php.net
pear        http://pear.php.net
github      https://github.com http://github.com
xemacs      http://ftp.xemacs.org/pub/ ftp://ftp.sa.xemacs.org/pub/
launchpad   http://launchpad.net
redhat      ftp://sources.redhat.com/pub/ (and probably others!)
etc.

The good part is that once you've modified thirdpartymirrors with new
mirrors, running mirrors.py --all will generate a big patch for all
your ebuilds to use those new mirrors !

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: Gentoo Janitor scripts

2012-02-22 Thread Corentin Chary
On Wed, Feb 22, 2012 at 1:20 PM, Mart Raudsepp l...@gentoo.org wrote:
 On K, 2012-02-22 at 09:48 +0100, Corentin Chary wrote:
 I did a quick script to count most used prefixes in SRC_URI yesterday
 (https://github.com/iksaif/portage-janitor/blob/master/mirrors.py)

 Here is the (filtered) result:

 $ eix --only-names | python mirrors.py --count
 960     http://dev.gentoo.org
 372     http://xorg.freedesktop.org
 372     http://xorg.freedesktop.org/releases
 372     http://xorg.freedesktop.org/releases/individual
 306     http://pear.php.net
 306     http://pear.php.net/get
 256     http://oss.tresys.com
 255     http://oss.tresys.com/files
 255     http://oss.tresys.com/files/refpolicy
 225     http://hackage.haskell.org/packages
 225     http://hackage.haskell.org/packages/archive
 225     http://hackage.haskell.org
 206     http://ftp.xemacs.org
 201     https://github.com
 196     http://ftp.xemacs.org/pub
 196     http://ftp.xemacs.org/pub/xemacs
 193     http://ftp.xemacs.org/pub/xemacs/packages
 181     http://gstreamer.freedesktop.org
 181     http://gstreamer.freedesktop.org/src
 175     http://launchpad.net
 175     http://linuxgazette.net
 143     http://github.com
 130     http://pear.horde.org
 130     http://pear.horde.org/get
 101     http://savannah.nongnu.org/download
 101     http://savannah.nongnu.org
 100     http://get.qt.nokia.com
 97      ftp://sources.redhat.com/pub
 97      ftp://sources.redhat.com
 96      http://get.qt.nokia.com/qt
 95      http://get.qt.nokia.com/qt/source
 90      http://download.gna.org
 75      http://pecl.php.net
 75      http://pecl.php.net/get
 72      http://components.ez.no/get
 72      http://components.ez.no
 69      https://fedorahosted.org
 67      http://www.phrack.org/archives
 67      http://www.phrack.org/archives/tgz
 67      http://www.phrack.org


 From that output we can easilly find out new entries to
 thirdpartymirrors, for example:
 gentoo-dev    http://dev.gentoo.org
 xorg             http://xorg.freedesktop.org
 gna              http://download.gna.org
 pecl             http://pecl.php.net
 pear             http://pear.php.net
 github          https://github.com http://github.com
 xemacs       http://ftp.xemacs.org/pub/ ftp://ftp.sa.xemacs.org/pub/
 launchpad    http://launchpad.net
 redhat         ftp://sources.redhat.com/pub/ (and probably others !)
 etc...

 The good part is that once you've modified thirdpartymirrors with new
 mirrors, running mirrors.py --all will generate a big patch for all
 your ebuilds to use those new mirrors !

 If you want this, then you should better figure out actual upstream
 mirroring systems and their list of mirrors they would want us to use.
 Until such, this seems to be just for shortening SRC_URI addresses when
 an upstream tarball domain name or path repeats, and that's definitely
 not what thirdpartymirrors is for.

Yes, of course, that was just a quick example, not something definitive.

But let's take some examples:
- http://xorg.freedesktop.org: it's easy to find a mirror for that one,
http://ftp.x.org/pub/ for example
- github: packages seem to use both http and https; this script can help
standardize the URL used
- 960 http://dev.gentoo.org: that's a lot of packages hosted
there, is that really right?

And still, thirdpartymirrors has some entries with only one mirror,
and I believe factoring out SRC_URIs is a good thing (if something
changes, you just patch thirdpartymirrors, not hundreds of ebuilds).

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: Gentoo Janitor scripts

2012-02-22 Thread Corentin Chary
On Wed, Feb 22, 2012 at 7:03 PM, Markos Chandras hwoar...@gentoo.org wrote:

 On 02/22/2012 01:33 PM, Ben wrote:
 On 22 February 2012 20:36, Corentin Chary
 corentin.ch...@gmail.com wrote:
 -  960     http://dev.gentoo.org: that's a lot of package
 hosted here, is that really right ?

 That includes patches 20kb

 Gentoo devs are supposed to put patches, tarballs and whatever they
 want in their space. This is the recommended policy documented in
 devmanual as well

 http://devmanual.gentoo.org/general-concepts/mirrors/index.html

Yes, you're right, I totally forgot about patches. I'll try to make
the script smarter and skip those.

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] Gentoo Janitor scripts

2012-02-20 Thread Corentin Chary
Hi,
Since I plan to use the remote-id tag for euscan, and I already
use SRC_URI but would like all ebuilds to use mirrors, I've written two
scripts to clean up your ebuilds and metadata.
They are available here: https://github.com/iksaif/portage-janitor
Here is what you can do with them:

python remoteids.py --diff pycuda Test-Tester Alien-SDL ostinato
--- a/dev-python/pycuda/metadata.xml
+++ b/dev-python/pycuda/metadata.xml
@@ -4,4 +4,7 @@
         <maintainer>
                 <email>sp...@gentoo.org</email>
         </maintainer>
+        <upstream>
+                <remote-id type="pypi">pycuda</remote-id>
+        </upstream>
 </pkgmetadata>
--- a/dev-perl/Alien-SDL/metadata.xml
+++ b/dev-perl/Alien-SDL/metadata.xml
@@ -7,4 +7,7 @@
     <email>ssuomi...@gentoo.org</email>
     <name>Samuli Suominen</name>
   </maintainer>
+  <upstream>
+    <remote-id type="cpan">Alien-SDL</remote-id>
+  </upstream>
 </pkgmetadata>
--- a/net-analyzer/ostinato/metadata.xml
+++ b/net-analyzer/ostinato/metadata.xml
@@ -7,5 +7,7 @@
         </maintainer>
         <longdescription lang="en">
         </longdescription>
+        <upstream>
+                <remote-id type="google-code">ostinato</remote-id>
+        </upstream>
 </pkgmetadata>


$ eix -C dev-python --only-names | python mirrors.py --diff
--- a/dev-python/asciitable/asciitable-0.8.0.ebuild
+++ b/dev-python/asciitable/asciitable-0.8.0.ebuild
@@ -11,7 +11,7 @@

 DESCRIPTION="An extensible ASCII table reader"
 HOMEPAGE="http://pypi.python.org/pypi/asciitable http://cxc.harvard.edu/contrib/asciitable"
-SRC_URI="http://pypi.python.org/packages/source/a/${PN}/${P}.tar.gz"
+SRC_URI="mirror://pypi/a/${PN}/${P}.tar.gz"

 LICENSE="GPL-2"
 SLOT="0"
--- a/dev-python/cosmolopy/cosmolopy-0.1.102.ebuild
+++ b/dev-python/cosmolopy/cosmolopy-0.1.102.ebuild
@@ -15,7 +15,7 @@

 DESCRIPTION="Cosmology routines built on NumPy/SciPy"
 HOMEPAGE="http://roban.github.com/CosmoloPy/ http://pypi.python.org/pypi/CosmoloPy"
-SRC_URI="http://pypi.python.org/packages/source/C/${MY_PN}/${MY_P}.tar.gz"
+SRC_URI="mirror://pypi/C/${MY_PN}/${MY_P}.tar.gz"

 LICENSE="MIT"
 SLOT="0"


Feel free to test them, and if they are broken I'll gladly accept a patch :).
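
The SRC_URI rewrite shown in the diffs above boils down to a prefix
substitution. Here is a minimal Python sketch of that idea; it is not
the code from mirrors.py. It assumes thirdpartymirrors lines have the
form "name url [url ...]", and the pypi base URL used in the example is
inferred from the diff rather than read from the real file.

```python
def parse_thirdpartymirrors(text):
    """Parse "name url [url ...]" lines into {name: [urls]}."""
    mirrors = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and not fields[0].startswith("#"):
            mirrors[fields[0]] = fields[1:]
    return mirrors


def to_mirror_uri(uri, mirrors):
    """Rewrite uri to mirror://name/... using the longest matching base URL."""
    best_name, best_base = None, ""
    for name, bases in mirrors.items():
        for base in bases:
            base = base.rstrip("/")
            if uri.startswith(base + "/") and len(base) > len(best_base):
                best_name, best_base = name, base
    if best_name is None:
        return uri  # no known mirror, leave the URI untouched
    return "mirror://" + best_name + uri[len(best_base):]


# Assumed entry, inferred from the diff above.
mirrors = parse_thirdpartymirrors("pypi\thttp://pypi.python.org/packages/source\n")
print(to_mirror_uri(
    "http://pypi.python.org/packages/source/a/${PN}/${P}.tar.gz", mirrors))
```

Matching on the longest base keeps the result stable when one mirror
entry is a prefix of another.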

Maybe some bits could be integrated to repoman...

Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] RFC: upstream/watch in metadata.xml

2012-02-13 Thread Corentin Chary
As some may know, I'm working on euscan, and currently euscan only uses
CPV and SRC_URI to find new upstream versions.
This works well if the upstream URL and version scheme are sane or if
upstream has an API for that (rubygems, pypi, pecl, pear), but it's far
from optimal.

Debian use a specific file for that: debian/watch and it looks like
that (for media-plugins/vdr-softdevice):

opts=downloadurlmangle=s/prdownload/download/ \
   http://developer.berlios.de/project/showfiles.php?group_id=2051 \
   http://prdownload.berlios.de/softdevice/vdr-softdevice-(.+).tgz

opts specifies some options to mangle the final URL, and then there is a
list of URLs to scan. See man uscan for more information.
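
To make the idea concrete, here is a rough Python sketch of what such a watch entry asks a scanner to do. It is only an illustration, not how uscan or euscan actually implement it: the function name scan_listing and the sample page content are made up, and only the simple s/old/new/ form of the mangle rule is handled.

```python
import re


def scan_listing(listing, pattern, mangle=None):
    """Return (version, url) pairs for every URL in `listing` matching `pattern`.

    `pattern` is a regex with one capture group for the version.
    `mangle` is an optional sed-style rule "s/old/new/" applied to each URL.
    """
    results = []
    for match in re.finditer(pattern, listing):
        url, version = match.group(0), match.group(1)
        if mangle:
            # "s/old/new/" splits into ['s', 'old', 'new', '']
            _, old, new, _ = mangle.split("/")
            url = re.sub(old, new, url, count=1)
        results.append((version, url))
    return results


# Hypothetical download page content, for illustration only.
SAMPLE = """
<a href="http://prdownload.berlios.de/softdevice/vdr-softdevice-0.4.0.tgz">0.4.0</a>
<a href="http://prdownload.berlios.de/softdevice/vdr-softdevice-0.5.0.tgz">0.5.0</a>
"""
PATTERN = r"http://prdownload\.berlios\.de/softdevice/vdr-softdevice-(.+)\.tgz"

for version, url in scan_listing(SAMPLE, PATTERN, "s/prdownload/download/"):
    print(version, url)
```

The pattern finds the versions, and the mangle rule rewrites each matched URL into the one that should really be downloaded.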

Currently, if you run euscan on this package, it doesn't work at all:
http://euscan.iksaif.net/package/media-plugins/vdr-softdevice/
1/ it's hosted on gentoo mirrors, and scanning them takes too long
because all files are in the same directory
2/ the url doesn't contain the version

So, to help euscan (and other tools) with some packages, I think we
could introduce some hints in metadata.xml. This would extend the
existing upstream element:

<upstream>
<version-scan downloadurlmangle="s/prdownload/download/">http://developer.berlios.de/project/showfiles.php?group_id=2051 \
   http://prdownload.berlios.de/softdevice/vdr-softdevice-(.+).tgz</version-scan>
</upstream>

The format is not defined yet, but it would probably look like
debian/watch, which would make it possible to write a script that imports
(valid) debian/watch files into the associated metadata.xml when needed.

One other thing: metadata.xml already contains a remote-id tag, which
would greatly help euscan do its job, but a lot of packages
are lacking it:
- Should we patch repoman to scan SRC_URI and issue a warning when it
looks like a URI that matches a well-known remote-id?
- Should we write a script to update metadata.xml? It would be easy
for rubygem, pypi and pear packages.
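
Both options boil down to guessing a remote-id from SRC_URI. A minimal Python sketch of that guess could look like the following. The table of patterns and the function name guess_remote_id are assumptions made for this example (this is not repoman or euscan code), and a real list would need more hosts.

```python
import re

# Hypothetical table: remote-id type -> URI pattern. The capture group is
# taken as the upstream project name.
KNOWN_HOSTS = [
    ("pypi", re.compile(r"^mirror://pypi/\w/([^/]+)/")),
    ("pypi", re.compile(r"^https?://pypi\.python\.org/packages/source/\w/([^/]+)/")),
    ("rubygems", re.compile(r"^mirror://rubygems/(.+)-[^-]+\.gem$")),
    ("google-code", re.compile(r"^https?://([^./]+)\.googlecode\.com/files/")),
]


def guess_remote_id(src_uri):
    """Return (type, name) if the URI matches a known host, else None."""
    for remote_type, regex in KNOWN_HOSTS:
        match = regex.match(src_uri)
        if match:
            return remote_type, match.group(1)
    return None


print(guess_remote_id("mirror://pypi/a/asciitable/asciitable-0.8.0.tar.gz"))
print(guess_remote_id("http://example.org/foo-1.0.tar.gz"))
```

A repoman check would only warn when the guess is not None and metadata.xml has no matching remote-id; an update script would write the guessed value instead.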

Any comments? Objections? Ideas?

Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] RFC: upstream/watch in metadata.xml

2012-02-13 Thread Corentin Chary
On Mon, Feb 13, 2012 at 10:50 AM, Dirkjan Ochtman d...@gentoo.org wrote:
 On Mon, Feb 13, 2012 at 10:33, Corentin Chary corentin.ch...@gmail.com 
 wrote:
 One other thing, metadata.xml already contain a remote-id tag, which
 would be very great to help euscan do its job, but a lot of package
 are lacking it:
 - Should we patch repoman to scan SRC_URI and issue a warning when it
 looks like an URI that match a well known remote-id
 - Should we write a script to update metadata.xml ? It would be easy
 for rubygem, pypi and pear packages.

 Any comment ? Objections ? Ideas ?

 I like the idea for keeping the data somewhere for known-insane cases,
 and metadata.xml sounds like it might be fine. But I don't think we
 should add anything for the likes of PyPI, if we can easily derive
 that we should look on PyPI some other way (i.e. for python, many
 packages list a PyPI page in their HOMEPAGE).

For pypi (and some others), looking at SRC_URI is enough: it starts
with mirror://pypi/.
Still, for those, <upstream><remote-id> *must* be set because the
package name is not always exactly the same as in Gentoo. Currently
euscan tries to guess it, but the guess is not always accurate.
Most of the time, if remote-id is set, we don't need version-scan
because upstream provides a stable API to list versions.
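
For illustration, reading the remote-id back from metadata.xml takes only a few lines of Python with the standard library. This is a sketch, not euscan's code: the function name remote_ids and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET


def remote_ids(metadata_xml):
    """Return a list of (type, upstream name) pairs found under <upstream>."""
    root = ET.fromstring(metadata_xml)
    return [
        (element.get("type"), (element.text or "").strip())
        for element in root.findall("./upstream/remote-id")
    ]


# Hypothetical metadata.xml content, for illustration only.
SAMPLE = """<pkgmetadata>
  <upstream>
    <remote-id type="pypi">CosmoloPy</remote-id>
  </upstream>
</pkgmetadata>
"""

print(remote_ids(SAMPLE))
```

With that pair in hand, a site handler can query the upstream API with the exact upstream name (CosmoloPy here) instead of guessing it from the Gentoo package name (cosmolopy).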


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] RFC: upstream/watch in metadata.xml

2012-02-13 Thread Corentin Chary
On Mon, Feb 13, 2012 at 10:52 AM, Robin H. Johnson robb...@gentoo.org wrote:
 On Mon, Feb 13, 2012 at 10:33:11AM +0100, Corentin Chary wrote:
 Currently, if you run euscan on this package, it doesn't work at all:
 http://euscan.iksaif.net/package/media-plugins/vdr-softdevice/
 1/ it's hosted on gentoo mirrors, and scanning them takes too long
 because all files are in the same directory
 I've been wondering if it would help to have a pregenerated index go out
 to the mirrors from our master box, would that be useful for you?

Would be better, but the index would still be pretty big (and
currently euscan doesn't cache anything, maybe it should).


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] RFC: upstream/watch in metadata.xml

2012-02-13 Thread Corentin Chary
On Mon, Feb 13, 2012 at 10:59 AM, Corentin Chary
corentin.ch...@gmail.com wrote:
 On Mon, Feb 13, 2012 at 10:52 AM, Robin H. Johnson robb...@gentoo.org wrote:
 On Mon, Feb 13, 2012 at 10:33:11AM +0100, Corentin Chary wrote:
 Currently, if you run euscan on this package, it doesn't work at all:
 http://euscan.iksaif.net/package/media-plugins/vdr-softdevice/
 1/ it's hosted on gentoo mirrors, and scanning them takes too long
 because all files are in the same directory
 I've been wondering if it would help to have a pregenerated index go out
 to the mirrors from our master box, would that be useful for you?

 Would be better, but the index would still be pretty big (and
 currently euscan doesn't cache anything, maybe it should).


Note that even with an HTTP cache, scanning it would take a lot of CPU
if it is too big :/.

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] euscan-0.1.0 released

2011-12-04 Thread Corentin Chary
I released euscan-0.1.0 last week and hwoarang uploaded the associated
ebuild. You can now emerge it instead of euscan-.

Note that this is only the standalone utility, not the web interface.

If you have an overlay, you can run eix --in-overlay my-overlay
--only-names | xargs euscan to scan only your packages (and you can
use sys-process/parallel instead of xargs).
There is nothing really new in 0.1.0, except that this is the first release.

I don't know what could be added to the command line utility for the
next version, except maybe new site handlers (sourceforge, cpan,
mysql, etc..). Feel free to write some, it's really easy even with
basic python skills.

On the other hand, I have a long list of things in my TODO for the web
interface, including:
- Using celery task queue for on-demand scanning
- Better overlay integration (adding an overlay from the admin, etc...)
- Custom commands (euscaninit to build a local rootfs with portage and
layman stuff, scan a specific category/herd/maintainer/package)
- Migration to gentoo-infra ? :)


- euscan on iksaif.net: http://euscan.iksaif.net
- euscan for chromium-os: http://chromium-os.euscan.iksaif.net/
- euscan on github: https://github.com/iksaif/euscan

Thanks,
-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-11-17 Thread Corentin Chary
 Right, having a stable API and a special euscan tag on the wiki could
 solve that.


Speaking of APIs, I found some time to play with django-piston this
week, and here is the result:

- http://euscan.iksaif.net/about/api

Some examples:
- http://euscan.iksaif.net/api/1.0/packages/by-maintainer/1.json
- http://euscan.iksaif.net/api/1.0/packages/by-herd/openoffice.json
- http://euscan.iksaif.net/api/1.0/packages/by-category/app-office.xml
- http://euscan.iksaif.net/api/1.0/package/app-office/libreoffice-bin.json
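
As a small illustration, the URLs above can be built and fetched from Python like this. It is only a sketch: the helper names api_url and fetch are made up, and nothing is assumed about the layout of the returned JSON beyond what the API page documents.

```python
import json
from urllib.request import urlopen

API_ROOT = "http://euscan.iksaif.net/api/1.0"


def api_url(*parts, fmt="json"):
    """Build an API URL such as .../packages/by-herd/openoffice.json."""
    return "%s/%s.%s" % (API_ROOT, "/".join(parts), fmt)


def fetch(*parts):
    """Fetch and decode the JSON document for the given path components."""
    with urlopen(api_url(*parts)) as response:
        return json.load(response)


print(api_url("packages", "by-herd", "openoffice"))
print(api_url("packages", "by-category", "app-office", fmt="xml"))
```

Calling fetch("package", "app-office", "libreoffice-bin") would then return the decoded document for that package, provided the server is reachable.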


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-25 Thread Corentin Chary
 Open a bug, attach your ebuilds ( are there any releases besides the
 git code? ) and I will take it from there

 Here it is: https://bugs.gentoo.org/show_bug.cgi?id=383937
 There are no released version, and the project have no homepage
 (except http://euscan.iksaif.net).

euscan ebuild is now in portage,
Thanks Markos !

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] euscan proof of concept (like debian's uscan)

2011-09-23 Thread Corentin Chary
On Mon, Sep 19, 2011 at 10:53 AM, Michał Górny mgo...@gentoo.org wrote:
 On Mon, 19 Sep 2011 10:39:11 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 ## Also update eix database, because we use eix internaly
 ## Bottleneck: disk and cpu
 ##Time: 30mn ~ 1h
 eix-update

 Using egencache to keep caches for overlays will make eix updates much
 faster.

 Here's my code for it (it uses overlays in /usr/portage/local):

 cd /usr/portage/local && \
 for O in */; do
        echo ${O}
        egencache --jobs=8 --update --update-use-local-desc --rsync \
                --repo=$(cat ${O}profiles/repo_name)
 done


Just did that, and indeed, it's much faster.

Also, I switched to PostgreSQL, and since that avoids a lot of deadlock
situations when making queries in parallel, I'm now able to do this:

eix --only-names -x | gparallel --eta --load 8 --jobs 400%
--max-args=64 python manage.py scan-metadata

And now it takes less than 30 minutes instead of more than an hour, and more
importantly, it scales.

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-21 Thread Corentin Chary
On Tue, Sep 20, 2011 at 9:04 PM, Donnie Berkholz dberkh...@gentoo.org wrote:
 On 10:00 Tue 20 Sep     , Corentin Chary wrote:
 Could someone write ebuilds for euscan and euscanwww ? It should not
 take a lot of time, but my ebuilds skills are probably not good
 enought to do that.

 Sounds like good practice for when you become a Gentoo dev. =)

Hehe :). Maybe someday, but honestly I don't have any time right now
to fill in the quiz seriously.

Anyway, here is an ebuild for euscan:
http://git.iksaif.net/?p=portage.git;a=blob;f=app-portage/euscan/euscan-.ebuild;h=760f3ad025a009e3c336b1e7b08b31b72b6fec1f;hb=45b37c438f3cc20325d03ada666834b9db53d7b1
for those who want to test.
Writing one for euscanwww is probably less interesting since anyway
you'll have to configure most of it by hand (database, cron, etc..)
and it's clearly not something more than 2-3 people will install.

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-21 Thread Corentin Chary
On Wed, Sep 21, 2011 at 12:58 PM, Markos Chandras hwoar...@gentoo.org wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA512

 On 09/21/11 11:57, Corentin Chary wrote:
 On Tue, Sep 20, 2011 at 9:04 PM, Donnie Berkholz
 dberkh...@gentoo.org wrote:
 On 10:00 Tue 20 Sep     , Corentin Chary wrote:
 Could someone write ebuilds for euscan and euscanwww ? It
 should not take a lot of time, but my ebuilds skills are
 probably not good enought to do that.

 Sounds like good practice for when you become a Gentoo dev. =)

 Hehe :). Maybe someday, but honestly I don't have any time right
 now to fill the quizz seriously.

 Anyway, here is an ebuild for euscan:
 http://git.iksaif.net/?p=portage.git;a=blob;f=app-portage/euscan/euscan-.ebuild;h=760f3ad025a009e3c336b1e7b08b31b72b6fec1f;hb=45b37c438f3cc20325d03ada666834b9db53d7b1


 for those who want to test.
 Writing one for euscanwww is probably less interesting since
 anyway you'll have to configure most of it by hand (database, cron,
 etc..) and it's clearly not something more than 2-3 people will
 install.

 Hi,

 Open a bug, attach your ebuilds ( are there any releases besides the
 git code? ) and I will take it from there

Here it is: https://bugs.gentoo.org/show_bug.cgi?id=383937
There is no released version, and the project has no homepage
(except http://euscan.iksaif.net).
 Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-20 Thread Corentin Chary
On Mon, Sep 19, 2011 at 9:23 PM, Hans de Graaff gra...@gentoo.org wrote:
 On Wed, 2011-08-31 at 15:41 +0200, Corentin Chary wrote:
 Hi,

 some news about euscan (still available at http://euscan.iksaif.net)

 - New design (yay !)
 - Atom feeds available for each herd/category/maintainer/package
 (http://euscan.iksaif.net/maintainers/59/feed/)
 - Specific handlers for PyPi, RubyGems, pecl and PEAR packages (check
 http://git.iksaif.net/?p=euscan.git;a=tree;f=pym/euscan/handlers;h=9a995dfcebe6beecce71851abb84a875cf6e5979;hb=HEAD
 ).

 Now, maybe we should find a way to integrate that with the GSoC
 statistic project and with http://packages.gentoo.org/ (like done at
 http://packages.qa.debian.org/p/php-net-ipv4.html ). A quick way would
 be to host euscan on a gentoo server, and add some webservices to
 publish the data in json or xml, then packages.gentoo.org and others
 could parse that and display it.

 One integration that might also be useful is what we track on the ruby
 wiki currently: https://overlays.gentoo.org/proj/ruby/wiki/PendingBumps
 which is a collection of quick notes on problems encountered while
 bumping packages. The ability to attach notes to packages and to
 acknowledge version bumps (and e.g. make them yellow instead of red)
 could help with larger lists for herds.

 Kind regards,

 Hans


Yep,
You can already use that page: http://euscan.iksaif.net/herds/ruby/
And ruby packages are well tracked by euscan thanks to the rubygem
site handler; see how it helped to find new upstream versions:
http://euscan.iksaif.net/herds/ruby/charts/versions-monthly.png

Could someone write ebuilds for euscan and euscanwww? It should not
take a lot of time, but my ebuild skills are probably not good
enough to do that.

Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] euscan proof of concept (like debian's uscan)

2011-09-19 Thread Corentin Chary
On Mon, Sep 19, 2011 at 9:35 AM, Dirkjan Ochtman d...@gentoo.org wrote:
 On Mon, Sep 19, 2011 at 00:27, Paweł Hajdan, Jr.
 phajdan...@gentoo.org wrote:
 Okay, I think this is pretty cool and we should find it a new home in
 the Gentoo infrastructure.

 I was thinking about http://qa-reports.gentoo.org/ with the repo at
 http://git.overlays.gentoo.org/gitweb/?p=proj/qa-scripts.git;a=summary

 I can act as a proxy committer and reviewer for that code. Could you
 break it up into some smaller parts (preferably backend first) and send
 to me for review (if you're interested)?

 How long does it take to generate the reports?

 +1 I think it would be good to run this on Gentoo infra, and I
 wouldn't mind helping out.

 Bikeshedding: not sure reports is the best name for this, as reports
 implies something more static?

Here is how it works: each week I launch this script on lt server.
I've got ~30 trees installed with layman. The server is an AMD X2
4600+ with 4GB of RAM and two 80GB HDs in RAID1 using ext4. My network
bandwidth is 20Mbps down, 1Mbps up.

#!/bin/sh

## Setup some vars to use local portage tree
export PATH=${HOME}/euscan/bin:${PATH}
export PYTHONPATH=${HOME}/euscan/pym:${PYTHONPATH}
export PORTAGE_CONFIGROOT=${HOME}/local
export ROOT=${HOME}/local
export EIX_CACHEFILE=${HOME}/local/var/cache/eix

## Go to euscanwww dir
cd ${HOME}/euscan/euscanwww/

## Update local trees
## Bottleneck: disk and network bandwidth
## Time: less than 30mn
emerge --sync --root=${ROOT} --config-root=${PORTAGE_CONFIGROOT}
ROOT=/ layman -S --config=${ROOT}/etc/layman/layman.cfg

## Also update eix database, because we use eix internally
## Bottleneck: disk and cpu
## Time: 30mn ~ 1h
eix-update

## Scan portage (packages, versions)
## Bottleneck: disk and cpu
## Time:  15mn
## Note: this script uses eix to get a list of packages and versions
python manage.py scan-portage --all --purge-versions --purge-packages

## Scan metadata (herds, maintainers, homepages, ...)
## Bottleneck: disk
## Time: 1h ~ 1h30
## Note: this script uses gentoolkit to fetch metadata
python manage.py scan-metadata --all --progress

## Scan upstream packages
## Bottleneck: disk, network bandwidth and latency, cpu
## Time: up to 6h
## Note: euscan is called on each package. euscan has a slow startup
##  caused by gentoolkit/portage.
##  gparallel is used here to limit the load caused by euscan,
##  and to launch up to 16 euscan instances at a time on this machine
##  this part is the longest, but scales very well
eix --only-names -x | gparallel --load 4 --jobs 800% euscan > ${HOME}/logs/euscan-upstream.log
python manage.py scan-upstream --feed --purge-versions < ${HOME}/logs/euscan-upstream.log

## Update counters (6)
## Time: some minutes
## Bottleneck: cpu
## Note: this script could probably be implemented faster using raw SQL queries
python manage.py update-counters
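
About the note on raw SQL for the counters: the idea is to let the database aggregate instead of looping over objects in Python. Here is a minimal sketch using sqlite3 so that it runs anywhere. The table and column names (package, version, packaged) are hypothetical and do not reflect the real euscanwww schema.

```python
import sqlite3


def count_versions(conn):
    """Per category: number of packages and number of upstream-only versions."""
    query = """
        SELECT p.category,
               COUNT(DISTINCT p.id) AS n_packages,
               SUM(CASE WHEN v.packaged = 0 THEN 1 ELSE 0 END) AS n_upstream
        FROM package AS p
        JOIN version AS v ON v.package_id = p.id
        GROUP BY p.category
        ORDER BY p.category
    """
    return conn.execute(query).fetchall()


# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package (id INTEGER PRIMARY KEY, category TEXT, name TEXT);
    CREATE TABLE version (package_id INTEGER, number TEXT, packaged INTEGER);
    INSERT INTO package VALUES (1, 'dev-ruby', 'rake'), (2, 'dev-ruby', 'rack');
    INSERT INTO version VALUES (1, '0.9.2', 1), (1, '0.9.3', 0), (2, '1.3.0', 1);
""")

print(count_versions(conn))
```

One GROUP BY query per counter type (category, herd, maintainer) would replace the per-package loop.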


 Also not sure how much it has to do
 with QA.
 How much of it constitutes the backend, in your opinion? It seems
 there are two parts, right now:

 1. euscan script, to find new versions for a single package
 2. the django www app, including storage for the version data

Yes, exactly. Here is how the tree is structured currently:

euscan script

bin/ -- contains the euscan python binary
pym/ -- contains most of the code used by the euscan script
pym/euscan/handlers -- contains specific site handlers (rubygems,
pypi, pecl, pear, ..)

euscanwww django app

euscanwww/ -- contains all the stuff for the django application, all
the django application needs is a working portage tree and euscan
available in the $PATH

 IMO it would be nice to have a somewhat generic REST-style service
 exposing the data, and build a simple UI on top of that. In
 particular, I have different ideas about what the UI should look like,
 so it would be nice if different people could experiment (and/or
 integrate in other services like znurt.org).

I already added some very basic JSON formatting (note that it also
exposes the internal key id, which is bad, but I just wanted to
experiment).
All you need to do is append /json to a URL. For example:

- http://euscan.iksaif.net/maintainers/4/json
- http://euscan.iksaif.net/package/app-accessibility/brltty/json

This could be a lot better, we just need to define an API and the
implementation will be easy.


A first step would be to make an ebuild for euscan, and another for
euscanwww, so that anyone can easily install it and play with it.
Feel free to ping me on IRC, I'm on #gentoo-sunrise, my nickname is iksaif.

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] euscan proof of concept (like debian's uscan)

2011-09-19 Thread Corentin Chary
On Mon, Sep 19, 2011 at 10:53 AM, Michał Górny mgo...@gentoo.org wrote:
 On Mon, 19 Sep 2011 10:39:11 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 ## Also update eix database, because we use eix internaly
 ## Bottleneck: disk and cpu
 ##Time: 30mn ~ 1h
 eix-update

 Using egencache to keep caches for overlays will make eix updates much
 faster.

 Here's my code for it (it uses overlays in /usr/portage/local):

 cd /usr/portage/local && \
 for O in */; do
        echo ${O}
        egencache --jobs=8 --update --update-use-local-desc --rsync \
                --repo=$(cat ${O}profiles/repo_name)
 done

Thanks, I'll try that.


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-07 Thread Corentin Chary
On Tue, Sep 6, 2011 at 8:20 PM, Matt Turner matts...@gentoo.org wrote:
 On Sun, Apr 3, 2011 at 1:20 PM, Corentin Chary corentin.ch...@gmail.com 
 wrote:
 Hi again,

 Found a little problem: it's not finding a newer version of
 wireless-regdb, which uses a date-based versioning system. If euscan
 tried to view/parse the directory index where the distfiles are
 located, it would find this new file.

 http://euscan.iksaif.net/package/net-wireless/wireless-regdb/

Hi,
If you try to run euscan on that, you'll get:

  * Url doesn't seems to depend on version: 20101124 not found in 
 http://wireless.kernel.org/download/wireless-regdb/wireless-regdb-2010.11.24.tar.bz2

What's happening here is that the version cannot be found in the URL, so
even a directory scan won't find anything because it won't be able to
match files.
It would work correctly if the version was 2010.11.24 or the file was
named wireless-regdb-20101124.tar.bz2.

Unfortunately, making that work would probably introduce a lot of false
positives in other cases.
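
To show the problem, here is a small Python sketch of the kind of check involved. It is a simplification written for this mail, not the actual euscan code, and both function names are made up.

```python
def url_depends_on_version(url, version):
    """Strict check: the exact version string must appear in the URL."""
    return version in url


def loosely_matches(url, version):
    """Relaxed check that ignores dots; shown only to illustrate the risk."""
    return version.replace(".", "") in url.replace(".", "")


URL = ("http://wireless.kernel.org/download/wireless-regdb/"
       "wireless-regdb-2010.11.24.tar.bz2")

print(url_depends_on_version(URL, "20101124"))    # ebuild version: no match
print(url_depends_on_version(URL, "2010.11.24"))  # upstream spelling: match
print(loosely_matches(URL, "20101124"))           # relaxed check: match
# The relaxed check also accepts unrelated files: version 1.2 "matches" 12.0.
print(loosely_matches("http://example.org/foo-12.0.tar.gz", "1.2"))
```

The last line is the kind of false positive I mean: once dots are ignored, version 1.2 is found in a file that is really version 12.0.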

Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-01 Thread Corentin Chary
On Thu, Sep 1, 2011 at 9:23 AM, Alex Legler a...@gentoo.org wrote:
 On Wednesday 31 August 2011 15:41:51 Corentin Chary wrote:
 Hi,

 some news about euscan (still available at http://euscan.iksaif.net)

 - New design (yay !)

 Glad you like it. Be sure to credit where you got it from, though.

Sorry, that was done in the dev version, but I forgot to push it
(http://euscan.iksaif.net/about/).
Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-01 Thread Corentin Chary
 Btw I have feature request, could it remember the sorting method i set?
 (so I don't have to click and reorder it every time i refresh)

 Per-page or globally ?

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-01 Thread Corentin Chary
On Thu, Sep 1, 2011 at 10:31 AM, Michał Górny mgo...@gentoo.org wrote:
 On Wed, 31 Aug 2011 15:41:51 +0200
 Corentin Chary corentin.ch...@gmail.com wrote:

 - Specific handlers for PyPi, RubyGems, pecl and PEAR packages (check
 http://git.iksaif.net/?p=euscan.git;a=tree;f=pym/euscan/handlers;h=9a995dfcebe6beecce71851abb84a875cf6e5979;hb=HEAD
 ).

 AFAICS that specific handlers are required to grab a list of versions.
 Is it or will it be possible to add a kind of semi-handlers which would
 just grab a list of all URIs (e.g. on github project) and let euscan
 match them with SRC_URI?

Yep, it's possible: you can do some specific stuff and import
functions from euscan.helpers or euscan.handlers.generic to do a
generic scan, then return a correct list of versions.
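
Roughly, such a semi-handler could be shaped like the sketch below. This is standalone Python that does not import euscan; the function names, the URL prefix and the tarball pattern are assumptions made for the example, and a real handler would fetch the candidate URIs from the site instead of receiving them as an argument.

```python
import re


def can_handle(url):
    """Decide whether this handler applies to the given SRC_URI."""
    return url.startswith("https://github.com/")


def match_versions(package, candidate_urls):
    """Keep the URIs that look like release tarballs of `package`.

    Returns a list of (version, url) pairs.
    """
    pattern = re.compile(
        re.escape(package) + r"-(\d+(?:\.\d+)*)\.tar\.(?:gz|bz2)$"
    )
    versions = []
    for url in candidate_urls:
        match = pattern.search(url)
        if match:
            versions.append((match.group(1), url))
    return versions


# Hypothetical list of URIs grabbed from a project page.
URIS = [
    "https://github.com/downloads/user/foo/foo-1.2.tar.gz",
    "https://github.com/downloads/user/foo/README",
    "https://github.com/downloads/user/foo/foo-1.10.tar.bz2",
]

print(match_versions("foo", URIS))
```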

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-01 Thread Corentin Chary
2011/9/1 Tomáš Chvátal scarab...@gentoo.org:
 Dne 1.9.2011 09:55, Corentin Chary napsal(a):

 Btw I have feature request, could it remember the sorting method i set?
 (so I don't have to click and reorder it every time i refresh)

  Per-page or globally ?

 I would say globaly i smore sane here

I did that per-page, as it was only one line to add. I'll try to do
it globally later.

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-09-01 Thread Corentin Chary
On Wed, Aug 31, 2011 at 3:41 PM, Corentin Chary
corentin.ch...@gmail.com wrote:
 Hi,

 some news about euscan (still available at http://euscan.iksaif.net)

 - New design (yay !)
 - Atom feeds available for each herd/category/maintainer/package
 (http://euscan.iksaif.net/maintainers/59/feed/)
 - Specific handlers for PyPi, RubyGems, pecl and PEAR packages (check
 http://git.iksaif.net/?p=euscan.git;a=tree;f=pym/euscan/handlers;h=9a995dfcebe6beecce71851abb84a875cf6e5979;hb=HEAD
 ).

A quick example of what a custom site handler can bring (rubygems
here): http://euscan.iksaif.net/categories/dev-ruby/charts/versions-weekly.png
:)

 Now, maybe we should find a way to integrate that with the GSoC
 statistic project and with http://packages.gentoo.org/ (like done at
 http://packages.qa.debian.org/p/php-net-ipv4.html ). A quick way would
 be to host euscan on a gentoo server, and add some webservices to
 publish the data in json or xml, then packages.gentoo.org and others
 could parse that and display it.

 Thanks,

 --
 Corentin Chary
 http://xf.iksaif.net




-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-08-31 Thread Corentin Chary
Hi,

some news about euscan (still available at http://euscan.iksaif.net)

- New design (yay !)
- Atom feeds available for each herd/category/maintainer/package
(http://euscan.iksaif.net/maintainers/59/feed/)
- Specific handlers for PyPi, RubyGems, pecl and PEAR packages (check
http://git.iksaif.net/?p=euscan.git;a=tree;f=pym/euscan/handlers;h=9a995dfcebe6beecce71851abb84a875cf6e5979;hb=HEAD
).

Now, maybe we should find a way to integrate that with the GSoC
statistic project and with http://packages.gentoo.org/ (like done at
http://packages.qa.debian.org/p/php-net-ipv4.html ). A quick way would
be to host euscan on a gentoo server, and add some webservices to
publish the data in json or xml, then packages.gentoo.org and others
could parse that and display it.

Thanks,

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-05-05 Thread Corentin Chary
New dynamic charts:

- http://euscan.iksaif.net/maintainers/2/charts/versions-weekly.png
- http://euscan.iksaif.net/categories/app-accessibility/charts/packages-weekly.png
- http://euscan.iksaif.net/categories/app-accessibility/charts/packages-weekly-small.png

Feel free to use them in a plasma-applet, or whatever you want !



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-05-02 Thread Corentin Chary
On Thu, Apr 28, 2011 at 2:34 PM, Corentin Chary
corentin.ch...@gmail.com wrote:
 2011/4/27 Tomáš Chvátal scarab...@gentoo.org:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Dne 27.4.2011 10:04, Corentin Chary napsal(a):
 I updated http://euscan.iksaif.net yesterday.

 I also added some overlays, and I now use local portage portage trees
 instead of system trees (a script to do that is available in
 euscanwww/script).

 New features:
 - Handle overlays correctly (special column for overlays)
 - Add some charts (small bar charts in table, pies at the bottom)
 - Less false positives with euscan

 I'll try to add all time line charts this week.

 Thanks,
 Found another issue, package stays there even if it was removed from the
 portage tree (kde-misc/plasmaboard is a good example).

 Good catch, I'll add a --purge option to scan-portage.

Done, 
http://git.iksaif.net/?p=euscan.git;a=commitdiff;h=120ae425af82061fee4e0d7f18104595c551a144;hp=244e7d64fdc6ddeb31ad54dfcc415c50672bdebb


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-28 Thread Corentin Chary
2011/4/27 Tomáš Chvátal scarab...@gentoo.org:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Dne 27.4.2011 10:04, Corentin Chary napsal(a):
 I updated http://euscan.iksaif.net yesterday.

 I also added some overlays, and I now use local portage portage trees
 instead of system trees (a script to do that is available in
 euscanwww/script).

 New features:
 - Handle overlays correctly (special column for overlays)
 - Add some charts (small bar charts in table, pies at the bottom)
 - Less false positives with euscan

 I'll try to add all time line charts this week.

 Thanks,
 Found another issue, package stays there even if it was removed from the
 portage tree (kde-misc/plasmaboard is a good example).

Good catch, I'll add a --purge option to scan-portage.


-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-27 Thread Corentin Chary
I updated http://euscan.iksaif.net yesterday.

I also added some overlays, and I now use local portage portage trees
instead of system trees (a script to do that is available in
euscanwww/script).

New features:
- Handle overlays correctly (special column for overlays)
- Add some charts (small bar charts in table, pies at the bottom)
- Less false positives with euscan

I'll try to add all time line charts this week.

Thanks,
-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-20 Thread Corentin Chary
2011/4/19 Tomáš Chvátal scarab...@gentoo.org:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Dne 16.4.2011 08:29, Corentin Chary napsal(a):
 New website is up and running at

 http://euscan.iksaif.net/

 The git tree is still at http://git.iksaif.net/?p=euscan.git;a=summary

 TODO:
 - Make some charts to see how it's going
 - Finish the scan my world feature
 - Add a way to subsribe to herds/maintainer/packages in order to
 receive weekly/monthly reports

 I'll gladly accept any patch ! :)

 Hello found another issue,

 http://euscan.iksaif.net/package/media-video/kmplayer/

 0.11.2c IS newer than 0.11.2 :)

 Cheers

Should be fixed this evening. I'll now use portage.vercmp() by
default, and fall back to pkg_resources.
Thanks for the report.
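
For the record, the kmplayer case comes down to handling the trailing letter. Here is a deliberately simplified Python sketch of such a comparison. It is not Portage's algorithm (no _alpha/_rc suffixes, no -r revisions, no leading-zero rules), and the function names are made up.

```python
import re


def parse(version):
    """Split '0.11.2c' into ((0, 11, 2), 'c'); the letter part may be empty."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)([a-z]?)", version)
    if match is None:
        raise ValueError("unsupported version: %r" % version)
    numbers = tuple(int(n) for n in match.group(1).split("."))
    return numbers, match.group(2)


def compare(a, b):
    """Return 1 if a is newer than b, -1 if older, 0 if equal."""
    key_a, key_b = parse(a), parse(b)
    return (key_a > key_b) - (key_a < key_b)


print(compare("0.11.2c", "0.11.2"))
print(compare("0.11.2", "0.11.10"))
```

Numbers are compared as integers, so 0.11.10 is newer than 0.11.2, and a letter suffix sorts after no suffix, so 0.11.2c is newer than 0.11.2.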

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-19 Thread Corentin Chary
On Mon, Apr 18, 2011 at 7:45 PM, Donnie Berkholz dberkh...@gentoo.org wrote:
 On 15:48 Mon 18 Apr     , Corentin Chary wrote:
 On Mon, Apr 18, 2011 at 3:32 PM, Donnie Berkholz dberkh...@gentoo.org 
 wrote:
  - Another useful thing would be a way to supply CPV tokens for any
   unstable upstream series that we'll never add to the tree.

 There is already some kind of blacklist in euscan. Could you provide
 examples so I can implement that ?

 Sorry, forgot about this bit. I'm just talking about standard package
 tokens, the same kind used in package.mask or wherever:

 =x11-drivers/xf86-video-intel-2.14.90*
=x11-base/xorg-server-1.10.900

I'll try to implement that,
Thanks for the suggestion.
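
A first, naive version could treat the token as a glob, as in this Python sketch. It only covers the "=" operator with a trailing "*", it does not follow Portage's real atom matching rules, and the names BLACKLIST and is_blacklisted are made up for the example.

```python
from fnmatch import fnmatchcase

# Tokens in package.mask style; only "=cat/pkg-version*" is supported here.
BLACKLIST = ["=x11-drivers/xf86-video-intel-2.14.90*"]


def is_blacklisted(cpv, blacklist=BLACKLIST):
    """Return True if the category/package-version matches a token."""
    return any(fnmatchcase(cpv, token.lstrip("=")) for token in blacklist)


print(is_blacklisted("x11-drivers/xf86-video-intel-2.14.901"))
print(is_blacklisted("x11-drivers/xf86-video-intel-2.15.0"))
```

Operators such as >= would need a real version comparison rather than a glob.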

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-19 Thread Corentin Chary
 The list is missing amarok-2.4.0.90. You need to look both on stable and
 unstable trees to find all versions.
 ftp://ftp.kde.org/pub/kde/unstable/amarok/2.4.0.90/src

Hum, it's hard to guess the unstable url from the stable one
automatically. But maybe a custom site handler for ftp.kde.org would
do it.

 http://euscan.iksaif.net/package/dev-db/mariadb/

 You're missing quite a few versions of mariadb. They can be found in the
 official mysql overlay (
 http://git.overlays.gentoo.org/gitweb/?p=proj/mysql.git;a=tree;f=dev-db/mariadb
 )

I'll add some official overlays soon.

 http://downloads.askmonty.org/mariadb/5.1/
 http://downloads.askmonty.org/mariadb/5.2/

The 5.2 series is hard to find, because all mirrors redirect to
"http://downloads.askmonty.org/mariadb/" when trying to list the top
directory, and I can't guess the version from this page.

Brute force would be able to find it if the 5.2.0 archive was still available.

 http://euscan.iksaif.net/package/dev-db/mysql/

 You're missing many of the new releases of mysql.

The MySQL case may be easier to deal with than MariaDB.

I added these to my TODO. I'll try to find a generic way, but I don't
know if there is really one.

Thanks !
-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-19 Thread Corentin Chary
On Tue, Apr 19, 2011 at 9:29 AM, Dirkjan Ochtman d...@gentoo.org wrote:
 On Wed, Apr 13, 2011 at 08:47, Corentin Chary corentin.ch...@gmail.com 
 wrote:
 Better, what about a per-herd/per-maintainer rss feed ?

 This is what I'd most want, would be very cool to have.

 Perhaps it would be possible for some classes of packages to fine-tune
 the algorithm, e.g. check PyPI for Python things, PECL for PHP things,
 etc.


Thanks,
Added to TODO, but if you got some time, feel free to give it a try :).

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-18 Thread Corentin Chary
On Mon, Apr 18, 2011 at 10:26 AM, Marijn hk...@gentoo.org wrote:
 Nice work!

Thanks !

FYI: I ran a new scan this morning. It only took 3 hours!

GNU Parallel is really a great tool:
`python manage.py list-packages | gparallel --eta --progress --jobs
400% euscan | python manage.py scan-upstream --feed > /dev/null`

I'll try to implement charts as soon as I've got enough data.
-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-18 Thread Corentin Chary
On Mon, Apr 18, 2011 at 3:32 PM, Donnie Berkholz dberkh...@gentoo.org wrote:
 On 06:29 Sat 16 Apr     , Corentin Chary wrote:
 New website is up and running at

 http://euscan.iksaif.net/

 The git tree is still at http://git.iksaif.net/?p=euscan.git;a=summary

 TODO:
 - Make some charts to see how it's going
 - Finish the scan my world feature
 - Add a way to subsribe to herds/maintainer/packages in order to
 receive weekly/monthly reports

 I'll gladly accept any patch ! :)

 This is cool! I just looked through the x11-team packages and it seems
 very useful already. I have a few suggestions if you have time. =)

 - Some teams have official overlays. Supporting those as an additional
  source of ebuilds would be pretty nice.

I'll just check out these trees on my machine (and only enable them for euscan).
euscan uses portage, gentoolkit and eix internally, so overlays are not
a problem :).

Where could I get a list of official overlays that should be added ?

 - Another useful thing would be a way to supply CPV tokens for any
  unstable upstream series that we'll never add to the tree.

There is already some kind of blacklist in euscan. Could you provide
examples so I can implement that ?

 - This looks like a problem (a versioned tarball named cairo-5c):
  http://euscan.iksaif.net/package/x11-libs/cairo/

Hum ... yes, I'll try to use a better regex :).
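
One way to avoid this kind of false positive is to anchor the pattern on the
package name and require the captured part to look like a version. A minimal
sketch follows; the `version_from_filename` helper and the file names used
with it are made up for illustration and are not taken from euscan.

```python
import re


def version_from_filename(package, filename):
    """Return the version embedded in filename, or None if it does not match.

    The version must be purely dotted numbers, so a sibling project such as
    cairo-5c is not mistaken for version "5c-..." of cairo.
    """
    pattern = re.compile(
        r"^" + re.escape(package) + r"-(\d+(?:\.\d+)*)\.tar\.(?:gz|bz2|xz)$"
    )
    match = pattern.match(filename)
    return match.group(1) if match else None
```

The trade-off is that suffixed versions such as release candidates are
rejected too, so a real pattern would probably need to allow a few known
suffixes.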

Thanks !
-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-18 Thread Corentin Chary
On Mon, Apr 18, 2011 at 1:48 PM, Corentin Chary
corentin.ch...@gmail.com wrote:
 On Mon, Apr 18, 2011 at 3:32 PM, Donnie Berkholz dberkh...@gentoo.org wrote:
 On 06:29 Sat 16 Apr     , Corentin Chary wrote:
 New website is up and running at

 http://euscan.iksaif.net/

 The git tree is still at http://git.iksaif.net/?p=euscan.git;a=summary

 TODO:
 - Make some charts to see how it's going
 - Finish the scan my world feature
 - Add a way to subscribe to herds/maintainer/packages in order to
 receive weekly/monthly reports

 I'll gladly accept any patch ! :)

 This is cool! I just looked through the x11-team packages and it seems
 very useful already. I have a few suggestions if you have time. =)

 - Some teams have official overlays. Supporting those as an additional
  source of ebuilds would be pretty nice.

 I'll just check out these trees on my machine (and only enable them for euscan).
 euscan uses portage, gentoolkit and eix internally, so overlays are not
 a problem :).

 Where could I get a list of official overlays that should be added ?

 - Another useful thing would be a way to supply CPV tokens for any
  unstable upstream series that we'll never add to the tree.

 There is already some kind of blacklist in euscan. Could you provide
 examples so I can implement that ?

 - This looks like a problem (a versioned tarball named cairo-5c):
  http://euscan.iksaif.net/package/x11-libs/cairo/

 Hum ... yes, I'll try to use a better regex :).

Fixed in 
http://git.iksaif.net/?p=euscan.git;a=commitdiff;h=a7a15c0ac72178bf2f3a73f94a70113ed9856e5c;hp=e5278e0e0f8cf617dff15f84540d0d6604081e82



-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-18 Thread Corentin Chary
On Mon, Apr 18, 2011 at 5:22 PM, Daniel Pielmeier bil...@gentoo.org wrote:
 Corentin Chary schrieb am 18.04.2011 11:05:

 Fyi: I ran a new scan this morning. It only took 3 hours !

 GNU's Parallel is really a great tool:
 `python manage.py list-packages | gparallel --eta --progress --jobs
 400% euscan | python manage.py scan-upstream --feed > /dev/null`

 I'll try to implement charts as soon as I've got enough data.

 I really like it. Keep up the good work. It gives a warm and fuzzy
 feeling if there are no unpackaged releases for the own packages :)

 There is one issue I recognized. How do you get the links for the
 unpackaged upstream versions and do you verify if the listed tarballs
 exist? I am asking because for app-editors/bluefish [1] there are eleven
 unpackaged versions listed but only sources for the three 2.0.3 releases
 exist and there are no releases for 2.1.x and 2.2.x.

 http://euscan.iksaif.net/package/app-editors/bluefish/

For bluefish, it's because when you try to download
http://www.bennewitz.com/bluefish/stable/source/bluefish-2.2.3.tar.bz2
it redirects you to an earlier version.

Fixed here: 
http://git.iksaif.net/?p=euscan.git;a=commitdiff;h=d74b2c505619306ab1ce3d4af3f66084b4faeed9;hp=a7a15c0ac72178bf2f3a73f94a70113ed9856e5c
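
The general idea of guarding against such redirects can be sketched as
follows: after following redirects, accept the candidate only if the final
URL still mentions the requested version. This is an assumption about one
reasonable check, not a description of the commit linked above, and
`accept_candidate` is a hypothetical name.

```python
from urllib.parse import urlsplit


def accept_candidate(requested_url, final_url, version):
    """Accept a brute-forced URL only if the redirect kept the version.

    final_url is whatever the HTTP client ended up at after redirects.
    """
    if requested_url == final_url:
        return True
    # A redirect to another host is fine as long as the requested version
    # still appears in the path of the final URL.
    return version in urlsplit(final_url).path
```

With `urllib.request.urlopen`, the final URL is available through
`response.geturl()`. The substring test is deliberately crude (1.0 would
match inside 1.0.1), so a stricter comparison of the file name may be needed.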

Thanks !

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-16 Thread Corentin Chary
New website is up and running at

http://euscan.iksaif.net/

The git tree is still at http://git.iksaif.net/?p=euscan.git;a=summary

TODO:
- Make some charts to see how it's going
- Finish the scan my world feature
- Add a way to subscribe to herds/maintainer/packages in order to
receive weekly/monthly reports

I'll gladly accept any patch ! :)

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-14 Thread Corentin Chary
On Wed, Apr 13, 2011 at 4:58 PM, Corentin Chary
corentin.ch...@gmail.com wrote:
 Hi Corentin,

 Do you have a public repo for your django code/site available? I would
 really enjoy taking a look at what you have here, sounds cool.

 Thanks,
 Matt

Just pushed the scan-upstream command (with --parallel support !).
Now, it's time to do a full scan and implement the views.
I'm staying on #gentoo-unregistered (iksaif) if someone wants to get
involved in euscan :).

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-13 Thread Corentin Chary
On Sun, Apr 10, 2011 at 8:43 AM, Hans de Graaff gra...@gentoo.org wrote:
 On Sun, 2011-04-03 at 17:20 +, Corentin Chary wrote:

 misc:
 - automatic bug report
 - automatic email report for maintainer/herds

 I'm not sure if this makes sense. For example, it gets dev-lang/ruby
 wrong, thinking that our old patch files are missing versions. Also, for
 exiftool I only add production releases, so there will almost always be
 newer upstream intermediate releases. My point here is that there are
 often additional considerations in considering new releases, so fully
 automating this is going to be hard. Perhaps a weekly/monthly opt-in
 mail to the herd/maintainer with an overview would be useful?

Better, what about a per-herd/per-maintainer rss feed ?

 The website is nice to have as another source of scanning for updates,
 bookmarked.

The current site won't be updated, I'm working on a new django-based
website that should be up and running next week.
The url will probably be euscan.iksaif.net, but I'll post it here.

-- 
Corentin Chary
http://xf.iksaif.net



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-13 Thread Corentin Chary
 Hi Corentin,

 Do you have a public repo for your django code/site available? I would
 really enjoy taking a look at what you have here, sounds cool.

 Thanks,
 Matt

Hi,
Yes, my git repo is here: http://git.iksaif.net/?p=euscan.git;a=summary
Current dependencies are: eix-0.22.5, gentoolkit-0.3.0 and portage of course.

I didn't take the time to write any kind of documentation, but here is
what you'll find in the repo:
- euscan standalone utility
- euscanwww

Currently euscanwww only has basic templates, models and views. But
you can already run these commands to fill the database:

$ emerge --sync
$ eix-update
$ python manage.py scan-portage --all
$ python manage.py scan-metadata --all

I'll implement scan-upstream soon.

The old index.php is still available in earlier revisions.

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-04-03 Thread Corentin Chary
Hi again,

I found the time to work a little more on euscan:

- Git tree: http://git.iksaif.net/?p=euscan.git
- Demo web interface: http://xf.iksaif.net/dev/.euscan/web/

To feed the web interface, I created a quick and dirty euscan-update. It
currently depends on eix-0.22.5 and gentoolkit-0.3.0.

I launched a full portage scan yesterday, and as you can see on web
interface, it seems to work. You can browse by category, herd, and
maintainer.
The scan is not finished yet (currently scanning net-misc) and keep in
mind that euscan doesn't support mirror:// urls (although, it would be
easy to support some of them).
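
Supporting mirror:// mostly means expanding the mirror name into real base
URLs before scanning. A minimal sketch, assuming a small hand-written
mapping: `expand_mirror` is a hypothetical helper, and the hosts below are
placeholders rather than real mirrors. In a real tree the mapping would have
to come from Portage's own mirror configuration.

```python
def expand_mirror(uri, mirrors):
    """Expand mirror://name/path into one URL per known base URL.

    mirrors maps a mirror name to a list of base URLs. Non-mirror URIs are
    returned unchanged, and unknown mirror names yield an empty list.
    """
    prefix = "mirror://"
    if not uri.startswith(prefix):
        return [uri]
    name, _, path = uri[len(prefix):].partition("/")
    return [base.rstrip("/") + "/" + path for base in mirrors.get(name, [])]


# Placeholder mapping for illustration; these hosts are not real mirrors.
EXAMPLE_MIRRORS = {
    "example": [
        "http://mirror1.example.org/pub",
        "http://mirror2.example.org/pub/",
    ],
}
```

Each expanded URL can then go through the same scanning as a plain http://
SRC_URI.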

But there is still a lot to do to make something useful with that, and
as I said before, I don't have a lot of time allocated to that project,
so any help would be great.

Quick TODO and ideas:
euscan:
- clean euscan and add options parsing capabilities, with tons of options
- support mirror:// (when possible)
- fix known bugs (see BUGS)

euscan-update:
- clean
- optimize and parallelize
- remove gentoolkit dependency, handle overlays, handle cumulative scans

web interface:
- display all kinds of information (metadata, keywords, etc.)
- create charts and statistics
- mark/report false positives
- display euscan-update status
- display last euscan output message
- show when euscan can/cannot guess upstream version for a given package

misc:
- automatic bug report
- automatic email report for maintainer/herds



Re: [gentoo-dev] Re: euscan proof of concept (like debian's uscan)

2011-03-26 Thread Corentin Chary
On Sat, Mar 26, 2011 at 5:35 PM, Christian Faulhammer fa...@gentoo.org wrote:
 Hi,

 Corentin Chary corentin.ch...@gmail.com:
 I started this mostly to see if it was possible, and I don't know if
 i'll have the time to continue to work on that project, but I think
 gentoo really needs an automated way to detect outdated packages.
 This could also be a 2011 GSOC project (finishing euscan, and adding a
 web based interface to browse the results).

  Have you applied for a place in GSoC?

 V-Li

I can't apply to GSoC; I was suggesting that another student could work on that.

-- 
Corentin Chary
http://xf.iksaif.net



[gentoo-dev] euscan proof of concept (like debian's uscan)

2011-03-21 Thread Corentin Chary
Hi,

I recently started working on a small gentoo utility named euscan
(for Ebuild Upstream Scan).
For those who don't know debian's uscan, it scans upstream for new
versions. It's used by packages.qa.debian.org (example:
http://packages.qa.debian.org/p/php-net-ipv4.html ).
euscan is available at: http://xf.iksaif.net/bordel/euscan

Currently, it uses two heuristics to find new versions, both based on
SRC_URI and PV:
- Directory scanning: scan directories to find files with newer version
- Brute Force: generate new possible versions, and try to download files

Note that it also works when only a part of the version is available
in the url.
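
The brute force heuristic can be sketched roughly as below. This is a minimal
illustration under assumptions, not euscan's actual code: the function names
and the choice of two bumps per component are made up, and only purely
numeric dotted versions are handled.

```python
def candidate_versions(version, steps=2):
    """Guess plausible next versions by bumping each numeric component.

    Bumping a component resets everything to its right to zero, as in
    0.8.3 -> 0.9.0. Only purely numeric dotted versions are handled.
    """
    parts = [int(p) for p in version.split(".")]
    candidates = []
    for index in reversed(range(len(parts))):
        for step in range(1, steps + 1):
            bumped = parts[:index] + [parts[index] + step]
            bumped += [0] * (len(parts) - index - 1)
            candidates.append(".".join(str(p) for p in bumped))
    return candidates


def expand_template(template, version):
    """Substitute the ${PV} placeholder in a SRC_URI-style template."""
    return template.replace("${PV}", version)
```

Each expanded URL would then be probed, and a candidate kept only when the
server really serves that file.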

I think it would be great to have this information on
http://packages.gentoo.org/ and/or the unofficial
http://gentoo-portage.com/ website.

We could also add the ability to browse packages by maintainer to help
them see if they have any outdated packages.

I started this mostly to see if it was possible, and I don't know if
I'll have the time to continue to work on that project, but I think
gentoo really needs an automated way to detect outdated packages.
This could also be a 2011 GSOC project (finishing euscan, and adding a
web based interface to browse the results).

Examples:

$ ./euscan PEAR-Validate-0.8.3
Package: dev-php/PEAR-Validate-0.8.3
Herd: php
Maintainer: none
Location: /usr/portage/dev-php/PEAR-Validate
 * Scanning: http://pear.php.net/get/Validate-${PV}.tgz
 * Scanning: http://pear.php.net/get
 * Generating version from 0.8.3
 * Brute forcing: http://pear.php.net/get/Validate-${PV}.tgz
 * Trying: http://pear.php.net/get/Validate-0.8.4.tgz ...               [ ok ]
 * Trying: http://pear.php.net/get/Validate-0.8.5.tgz ...               [ !! ]
 * Trying: http://pear.php.net/get/Validate-0.9.0.tgz ...               [ !! ]
 * Trying: http://pear.php.net/get/Validate-0.10.0.tgz ...              [ !! ]
 * Trying: http://pear.php.net/get/Validate-0.8.6.tgz ...               [ !! ]
New Upstream Version: 0.8.4 http://pear.php.net/get/Validate-0.8.4.tgz

$ ./euscan icu4j-4.4.2
Package: dev-java/icu4j-4.4.2
Herd: java
Maintainer: none
Location: /usr/portage/dev-java/icu4j
 * Scanning: http://download.icu-project.org/files/icu4j/${PV}/icu4j-4_4_2-src.jar
 * Scanning: http://download.icu-project.org/files/icu4j
New Upstream Version: 4.5.1 http://download.icu-project.org/files/icu4j/4.5.1/
New Upstream Version: 4.5.2 http://download.icu-project.org/files/icu4j/4.5.2/
New Upstream Version: 4.6 http://download.icu-project.org/files/icu4j/4.6/
New Upstream Version: 4.6rc2 http://download.icu-project.org/files/icu4j/4.6rc2/
New Upstream Version: 4.7.1 http://download.icu-project.org/files/icu4j/4.7.1/
New Upstream Version: 4.6.1 http://download.icu-project.org/files/icu4j/4.6.1/
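
The icu4j run above relies on directory scanning. A minimal sketch of that
idea, assuming the HTML of the directory listing has already been fetched;
the class and function names are hypothetical, and only purely numeric
versions are considered, so an entry like 4.6rc2 is skipped here.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def version_key(version):
    """Turn '4.5.1' into (4, 5, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newer_versions(listing_html, current):
    """Return numeric versions linked from the listing that exceed current."""
    collector = LinkCollector()
    collector.feed(listing_html)
    found = set()
    for href in collector.links:
        name = href.rstrip("/")
        if re.fullmatch(r"\d+(?:\.\d+)*", name):
            if version_key(name) > version_key(current):
                found.add(name)
    return sorted(found, key=version_key)
```

Comparing tuples of integers avoids the string-comparison trap where 4.10
would sort before 4.9.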

$ ./euscan IceE-1.3.0-r1
Package: dev-cpp/IceE-1.3.0-r1
Herd: no-herd
Maintainer: maintainer-nee...@gentoo.org
Description: The Internet Communications Engine (Ice) is a modern
object-oriented middleware with support for C++, .NET, Java, Python,
Ruby, and PHP
Location: /usr/portage/dev-cpp/IceE
 * Scanning: http://www.zeroc.com/download/IceE/${0}.${1}//IceE-${PV}-linux.tar.gz
 * Scanning: http://www.zeroc.com/download/IceE
 * Generating version from 1.3.0
 * Brute forcing: http://www.zeroc.com/download/IceE/${0}.${1}//IceE-${PV}-linux.tar.gz
 * Trying: http://www.zeroc.com/download/IceE/1.3//IceE-1.3.1-linux.tar.gz ...      [ !! ]
 * Trying: http://www.zeroc.com/download/IceE/1.3//IceE-1.3.2-linux.tar.gz ...      [ !! ]
 * Trying: http://www.zeroc.com/download/IceE/1.4//IceE-1.4.0-linux.tar.gz ...      [ !! ]
 * Trying: http://www.zeroc.com/download/IceE/1.5//IceE-1.5.0-linux.tar.gz ...      [ !! ]

Thanks,
-- 
Corentin Chary
http://xf.iksaif.net