On Tue, 2006-07-11 at 16:22 +0200, Thomas Klausner wrote:
> > One thing that'd be useful to have a small module for would be parsing
> > the Changes file and showing just the changes in the last version. It'd
> > be a bit of work to make (does the file go backwards or forwards? How
> > are the versions separated? etc.), but it should be doable to get it
> > to work for 90% of the CPAN distributions.
> 
> Such a module would indeed be great. I was toying with the idea of
> writing a small script that compares new cpan uploads with stuff
> installed on my machine(s) and reports new dists and what changed.
> 
> But I never got around to writing it, as changes-parsing seems quite
> futile.

We wouldn't need full-fledged parsing, I think. It seems safe to assume
that most changelogs are visually separated into blocks. From sifting
through a few examples, a split on qr{ \r?\n \s* \r?\n (?=\w) }x, with
just qr{ \r?\n (?=\w) }x as a fallback when the first pattern yields no
split, should separate the version blocks. (The lookahead keeps the
first character of each block from being eaten by the split.) Scanning
these blocks from both the start and the end (using
@blocks[1,2,3,-1,4,-2,5, ...]) until a version id is found that matches
the newest version should identify the correct changeset.
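
To make the idea concrete, here is a rough sketch of the split-and-scan
heuristic. It is only an illustration under a few assumptions, not the
code from cpan-changes.pl: the names split_blocks, scan_order and
find_changeset are made up for this example, the scan starts at index 0
rather than 1, and a block counts as a match only when the version
string appears on its first line.

```perl
use strict;
use warnings;

# Hypothetical helper: split a changelog into visually separated blocks.
sub split_blocks {
    my ($text) = @_;
    # Primary rule: one or more blank lines followed by an unindented word char.
    my @blocks = split /\r?\n\s*\r?\n(?=\w)/, $text;
    # Fallback: any newline followed by an unindented word char.
    @blocks = split /\r?\n(?=\w)/, $text if @blocks < 2;
    return @blocks;
}

# Hypothetical helper: indices for scanning from both ends, e.g. for six
# blocks 0,1,2,5,3,4 (a few from the front, then alternating back/front).
sub scan_order {
    my ($n) = @_;
    my @order;
    my ($front, $back) = (0, $n - 1);
    push @order, $front++ while $front <= $back && $front < 3;
    while ($front <= $back) {
        push @order, $back--;
        push @order, $front++ if $front <= $back;
    }
    return @order;
}

# Return the first block whose first line contains the wanted version,
# or undef if no block matches.
sub find_changeset {
    my ($text, $version) = @_;
    my @blocks = split_blocks($text);
    for my $i (scan_order(scalar @blocks)) {
        my ($head) = $blocks[$i] =~ /\A([^\r\n]*)/;
        # Assumption: reject matches embedded in a longer version such as 1.02.1.
        return $blocks[$i] if $head =~ /(?<![\w.])\Q$version\E(?![\w.])/;
    }
    return;
}
```

Calling find_changeset($changes_text, '1.02') would then hand back only
the block for that release. Restricting the match to the first line is a
guess on my part; it avoids false hits when an entry merely mentions
another version number.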

Do you think this approach is feasible and if yes, how can I access a
large enough body of changelogs to test and refine it?

Oh, by the way, the first version of this filter is available from
http://cpan.org/authors/id/W/WI/WILLERT/cpan-changes.pl (the name sucks
IMHO), so have a look. Another feature I added yesterday is just the one
you suggested: it can now filter out all uninstalled modules.

Cheers,
  Sebastian


