Bug#478930: lintian: Check for rfc2822 debian/copyright (http://wiki.debian.org/Proposals/CopyrightFormat)

2008-05-01 Thread Mathieu Parent
Package: lintian
Version: 1.23.46
Severity: wishlist
Tags: patch

The files add checks for the proposal:
http://wiki.debian.org/Proposals/CopyrightFormat

I hope this implementation will help to clarify things.

To enable, copy:
- copyright-specification to /usr/share/lintian/checks/
- copyright-specification.desc to /usr/share/lintian/checks/
- DebianCopyrightParser.pm to /usr/share/lintian/lib

and check with -I

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)

Kernel: Linux 2.6.18-6-xen-686 (SMP w/2 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages lintian depends on:
ii  binutils             2.18.1~cvs20080103-4  The GNU assembler, linker and bina
ii  diffstat             1.45-2                produces graph of changes introduc
ii  dpkg-dev             1.14.18               package building tools for Debian
ii  file                 4.23-2                Determines file type using magic
ii  gettext              0.17-2                GNU Internationalization utilities
ii  intltool-debian      0.35.0+20060710.1     Help i18n of RFC822 compliant conf
ii  libparse-debianchan  1.1.1-2               parse Debian changelogs and output
ii  liburi-perl          1.35.dfsg.1-1         Manipulates and accesses URI strin
ii  man-db               2.5.1-3               on-line manual pager
ii  perl [libdigest-md5  5.8.8-12              Larry Wall's Practical Extraction

lintian recommends no packages.

-- no debconf information
# control-file -- lintian check script -*- perl -*-
#
# Copyright (C) 2004 Marc Brockschmidt
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, you can find it on the World Wide
# Web at http://www.gnu.org/copyleft/gpl.html, or write to the Free
# Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
# MA 02110-1301, USA.

package Lintian::copyright_specification;
use strict;
use lib "$ENV{'LINTIAN_ROOT'}/checks/";
use common_data;
use Dep;
use DebianCopyrightParser;
use Tags;
use Data::Dumper;

sub run {

	my $pkg = shift;
	my $type = shift;

	#parse the copyright file
	my @data = DebianCopyrightParser::read_dpkg_copyright("unpacked/debian/copyright");
	my $follow_spec = 0;
	#check if it has a format-specification header
	foreach my $section (@data) {
		#skip sections that do not carry the header
		next if not $section->{'format-specification'};
		$follow_spec = 1;
		last;
	}

	#don't check other stuff if it doesn't have the format-specification header
	if ($follow_spec == 0) {
		tag "debian-copyright-no-specification", "";
	} else {
		#errors found by the parser
		foreach my $section (@data) {
			#next section if this is not an error
			next if not $section->{error};
			tag "debian-copyright-".$section->{error}, $section->{info};
		}

		# Check that every file in the tree has a license
		my $command = 'cd unpacked && find . -type f -a -not \( -false';
		foreach my $section (@data) {
			next if not $section->{files};
			#patterns are comma separated
			#TODO: manage quoted strings with comma in it
			$command .= ' -o -path ./'.join(' -o -path ./', split m/,\s/, $section->{files});
		}
		$command .= ' \) ; cd .. ';
		my $files_without_copyright = `$command`;
		if ($files_without_copyright) {
			foreach my $file (split '\n', $files_without_copyright) {
				tag 'debian-copyright-file-without-copyright', $file;
			}
		}

		# Check that every pattern match something
		foreach my $section (@data) {
			#next section if this is not a files section
			next if not $section->{files};
			#files are comma separated
			#TODO: manage quoted strings (with comma in it)
			my @patterns = split m/,\s/, $section->{files};
			foreach my $pattern (@patterns) {
				if (not `ls -l && cd unpacked && find . -type f -a -path $pattern ; cd ..`) {
					tag 'debian-copyright-section-without-match', $pattern;
				}
			}
		}
	}

}

1;

# vim: syntax=perl sw=4 ts=4 noet shiftround
Check-Script: copyright-specification
Author: Mathieu Parent [EMAIL PROTECTED]
Abbrev: csp
Type: source
Unpack-Level: 2

Tag: debian-copyright-no-specification
Type: info
Info: The package contains a copyright file that does not follow the
 proposed copyright format. This is not required by the policy.
 .
 More information on how to follow this proposed format at
 http://wiki.debian.org/Proposals/CopyrightFormat

Tag: 

2008-05-01 Thread Russ Allbery
Mathieu Parent [EMAIL PROTECTED] writes:

> Package: lintian
> Version: 1.23.46
> Severity: wishlist
> Tags: patch
>
> The files add checks for the proposal:
> http://wiki.debian.org/Proposals/CopyrightFormat
>
> I hope this implementation will help to clarify things.
>
> To enable, copy:
> - copyright-specification to /usr/share/lintian/checks/
> - copyright-specification.desc to /usr/share/lintian/checks/
> - DebianCopyrightParser.pm to /usr/share/lintian/lib

lib/Lintian/Parse/Copyright.pm or something along those lines would be
better.  I'm trying to move lintian into a proper Perl namespace habit so
that eventually we can install the lintian Perl modules into the regular
Perl module search path.

> and check with -I

It might be a bit premature to warn if people aren't using that format.
I've personally not started using it, for example, largely because it
doesn't yet have a standardized way of including all the things that
*aren't* just copyrights and licenses that belong in debian/copyright.

Checking the syntax if people are using it is certainly a good idea.

-- 
Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



2008-05-01 Thread Mathieu PARENT
On Thu, May 1, 2008 at 10:45 PM, Russ Allbery [EMAIL PROTECTED] wrote:
> Mathieu Parent [EMAIL PROTECTED] writes:
>
> > Package: lintian
> > Version: 1.23.46
> > Severity: wishlist
> > Tags: patch
> >
> > The files add checks for the proposal:
> > http://wiki.debian.org/Proposals/CopyrightFormat
> >
> > I hope this implementation will help to clarify things.
> >
> > To enable, copy:
> > - copyright-specification to /usr/share/lintian/checks/
> > - copyright-specification.desc to /usr/share/lintian/checks/
> > - DebianCopyrightParser.pm to /usr/share/lintian/lib
>
> lib/Lintian/Parse/Copyright.pm or something along those lines would be
> better.  I'm trying to move lintian into a proper Perl namespace habit so
> that eventually we can install the lintian Perl modules into the regular
> Perl module search path.

Ok

> > and check with -I
>
> It might be a bit premature to warn if people aren't using that format.
> I've personally not started using it, for example, largely because it
> doesn't yet have a standardized way of including all the things that
> *aren't* just copyrights and licenses that belong in debian/copyright.
>
> Checking the syntax if people are using it is certainly a good idea.

That is how it already behaves: it checks for the format-specification field.
If the field does not exist, it only shows an I: debian-copyright-no-specification.
If it exists, the checks go deeper. Most of the following checks give a
warning.
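
A minimal sketch of that gating, written so it can run on its own. The helper name `tags_for` and the shape of the section hashes are assumptions made for illustration; the attached check calls lintian's `tag` directly instead of returning a list.

```perl
use strict;
use warnings;

# Hypothetical helper: return the tag names the check would emit
# for a list of parsed sections (hash references).
sub tags_for {
    my (@sections) = @_;

    # No format-specification header anywhere: only the info tag applies.
    return ('debian-copyright-no-specification')
        unless grep { exists $_->{'format-specification'} } @sections;

    # Header present: report every error the parser recorded.
    return map { "debian-copyright-$_->{error}" }
           grep { $_->{error} } @sections;
}

print join(', ', tags_for({ files => '*' })), "\n";
print join(', ',
    tags_for({ 'format-specification' => 'x' }, { error => 'unknown-field' })),
    "\n";
```

The first call prints only the informational tag; the second prints debian-copyright-unknown-field.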


> --
> Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/




2008-05-01 Thread Russ Allbery
Mathieu PARENT [EMAIL PROTECTED] writes:

> That is how it already behaves: it checks for the format-specification
> field. If the field does not exist, it only shows an I:
> debian-copyright-no-specification.  If it exists, the checks go
> deeper. Most of the following checks give a warning.

Making it an I: tag means that Lintian is recommending that everyone use
the new format.  Are we sure we're okay with that?  If we are, that's
fine, but I think it's worth thinking about a bit first.  I've not gotten
the impression that the new format has been adopted very widely outside of
a few scattered groups, so we're basically adding an I: tag for almost the
entire archive.

-- 
Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/



2008-05-01 Thread Mathieu PARENT
On Thu, May 1, 2008 at 11:16 PM, Russ Allbery [EMAIL PROTECTED] wrote:
> Mathieu PARENT [EMAIL PROTECTED] writes:
>
> Making it an I: tag means that Lintian is recommending that everyone use
> the new format.  Are we sure we're okay with that?  If we are, that's
> fine, but I think it's worth thinking about a bit first.  I've not gotten
> the impression that the new format has been adopted very widely outside of
> a few scattered groups, so we're basically adding an I: tag for almost the
> entire archive.

Agreed. This bug is not really meant for inclusion now. It is the
implementation part of the proposal.

Also, I'm not a DD. I've done this because I think it is important and it is
listed in the NM tasks (http://wiki.debian.org/NMTasks).

The decision about including it is up to the DDs. If I get more feedback (or
after the lenny release?) I can send a mail to [EMAIL PROTECTED]

It could also be flagged as experimental or as notes
(http://lintian.debian.org/manual/ch2.html).



> Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/




2008-05-01 Thread Frank Lichtenheld
On Thu, May 01, 2008 at 02:16:28PM -0700, Russ Allbery wrote:
> Mathieu PARENT [EMAIL PROTECTED] writes:
> > That is how it already behaves: it checks for the format-specification
> > field. If the field does not exist, it only shows an I:
> > debian-copyright-no-specification.  If it exists, the checks go
> > deeper. Most of the following checks give a warning.
>
> Making it an I: tag means that Lintian is recommending that everyone use
> the new format.  Are we sure we're okay with that?  If we are, that's
> fine, but I think it's worth thinking about a bit first.  I've not gotten
> the impression that the new format has been adopted very widely outside of
> a few scattered groups, so we're basically adding an I: tag for almost the
> entire archive.

Yeah, I would leave that out, too.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/



2008-05-01 Thread Frank Lichtenheld
On Thu, May 01, 2008 at 09:45:34PM +0200, Mathieu Parent wrote:
> The files add checks for the proposal:
> http://wiki.debian.org/Proposals/CopyrightFormat

That specification seems to be still very volatile, from what I can see
from the changelog.

> # control-file -- lintian check script -*- perl -*-
> #
> # Copyright (C) 2004 Marc Brockschmidt

?

> 		# Check that every file in the tree has a license
> 		my $command = 'cd unpacked && find . -type f -a -not \( -false';
> 		foreach my $section (@data) {
> 			next if not $section->{files};
> 			#patterns are comma separated
> 			#TODO: manage quoted strings with comma in it
> 			$command .= ' -o -path ./'.join(' -o -path ./', split m/,\s/, $section->{files});
> 		}
> 		$command .= ' \) ; cd .. ';
> 		my $files_without_copyright = `$command`;
> 		if ($files_without_copyright) {
> 			foreach my $file (split '\n', $files_without_copyright) {
> 				tag 'debian-copyright-file-without-copyright', $file;
> 			}
> 		}

Hmm, there must be a better way to check that. I'm pretty sure we have a
list of files already available somewhere in the lintian working
directory. (If we haven't, we should)
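
A minimal sketch of what a pure-Perl version could look like, assuming the file list is already available as an array of paths relative to the source root. The names `glob_to_regex` and `check_coverage` are hypothetical and not part of lintian, and only `*` and `?` are treated as wildcards (as with `find -path`, `*` is allowed to match `/`).

```perl
use strict;
use warnings;

# Turn a find(1) -path style pattern into an anchored regex.
# Assumption: only '*' and '?' are special; bracket expressions are not handled.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = quotemeta $glob;   # escape everything first
    $re =~ s/\\\*/.*/g;         # then re-enable '*'
    $re =~ s/\\\?/./g;          # and '?'
    return qr/\A$re\z/s;
}

# Returns (\@files_without_pattern, \@patterns_without_file).
sub check_coverage {
    my ($files, $patterns) = @_;
    my %regex = map { $_ => glob_to_regex($_) } @$patterns;
    my %hits  = map { $_ => 0 } @$patterns;
    my @uncovered;
    for my $file (@$files) {
        my $covered = 0;
        for my $pattern (@$patterns) {
            next unless $file =~ $regex{$pattern};
            $hits{$pattern}++;
            $covered = 1;
        }
        push @uncovered, $file unless $covered;
    }
    my @unmatched = grep { !$hits{$_} } @$patterns;
    return (\@uncovered, \@unmatched);
}

my @files    = ('debian/rules', 'src/main.c', 'README');
my @patterns = ('debian/*', 'src/*.c', 'doc/*');
my ($uncovered, $unmatched) = check_coverage(\@files, \@patterns);
print "files without a license section: @$uncovered\n";
print "patterns without a match: @$unmatched\n";
```

One pass over the list covers both checks (files without a section, sections without a file) and avoids interpolating patterns from debian/copyright into a shell command.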

> 		# Check that every pattern match something
> 		foreach my $section (@data) {
> 			#next section if this is not a files section
> 			next if not $section->{files};
> 			#files are comma separated
> 			#TODO: manage quoted strings (with comma in it)
> 			my @patterns = split m/,\s/, $section->{files};
> 			foreach my $pattern (@patterns) {
> 				if (not `ls -l && cd unpacked && find . -type f -a -path $pattern ; cd ..`) {
> 					tag 'debian-copyright-section-without-match', $pattern;
> 				}
> 			}
> 		}
> 	}

Same here.

> Tag: debian-copyright-unknown-field
> Type: warning
> Info: The package contains a copyright file that as an unknown field.

Typo s/as/has/

> Tag: debian-copyright-file-without-copyright
> Type: warning
> Info: The package contains a copyright file that does match the specified
>  file.

? That makes no sense.

> Tag: debian-copyright-section-without-match
> Type: warning
> Info: The package contains a copyright file which has a section which does
>  match any file.

s/does/doesn't/ maybe?

> 		# pgp sig? - skip until end of signature
> 		elsif (m/^-----BEGIN PGP SIGNATURE/) {
> 			while (<$COPYRIGHT>) {
> 				$line_number++;
> 				last if m/^-----END PGP SIGNATURE/o;
> 			}
> 		}
> 		# other pgp control? - skip until the next blank line
> 		elsif (m/^-----BEGIN PGP/) {
> 			while (<$COPYRIGHT>) {
> 				$line_number++;
> 				last if /^\s*$/o;
> 			}
> 		}

Since when can copyright files contain signatures?

> 		# new field?
> 		elsif (m/^(\S+):\s*(.*)$/o) {
> 			my ($tag,$value) = (lc $1,$2);
> 			#format-specification, files and notice always start a section
> 			if($tag =~ /format-specification|files|notice/i) {

You already make an lc on $tag, no need to make all the regexes
case-insensitive.
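
As a sketch of that point: once the tag is lowercased, a plain lookup is enough. The helper name `parse_field` is hypothetical, and the exact-match hash is a deliberate design choice that differs from the quoted unanchored regex, which would also accept a field such as `x-files`.

```perl
use strict;
use warnings;

# Fields that always open a new section (from the quoted comment).
my %starts_section = map { $_ => 1 } qw(format-specification files notice);

# Hypothetical helper: split "Name: value" and report whether the
# field opens a new section. Returns an empty list if the line is not a field.
sub parse_field {
    my ($line) = @_;
    return unless $line =~ m/^(\S+):\s*(.*)$/;
    my ($tag, $value) = (lc $1, $2);   # lowercase once, compare exactly afterwards
    return ($tag, $value, $starts_section{$tag} ? 1 : 0);
}

my ($tag, $value, $new_section) = parse_field('Files: debian/*');
print "$tag => $value (new section: $new_section)\n";
```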

> # not used

So why is it included?

> sub _ensure_file_is_sane {

Please use the one from Util.pm

> my ($file) = @_;
>
> # if file exists and is not 0 bytes
> if (-f $file and -s $file) {
>   return 1;
> }
> return 0;
> }
>
> #
>
> sub fail {

Please use the one from Util.pm

> my $str = "internal error";
> if (@_) {
>   $str .= ": ".join( "\n", @_)."\n";
> } elsif ($!) {
>   $str .= ": $!\n";
> } else {
>   $str .= ".\n";
> }
> $! = 2; # set return code outside eval()
> die $str;

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/


