Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-09-06 Thread Andreas Tille
Hi Nicolas,

On Wed, Sep 05, 2012 at 01:17:47AM +0200, Nicolas Boulenguez wrote:
 
 diff --git a/scripts/uscan.pl b/scripts/uscan.pl
 index 649f822..34e31a9 100755
 --- a/scripts/uscan.pl
 +++ b/scripts/uscan.pl
 @@ -1494,17 +1494,9 @@ EOF
  print STDERR Error: $main_source_dir is no directory;
  }
  my $nfiles_before = `find $main_source_dir | wc -l`;
 -foreach (grep {/\//} split /\s+/, $data-{files-excluded}) {
 -# delete trailing '/' because otherwise find -path will fail
 -s?/+$?? ;
 -# use rm -rf to enable deleting non-empty directories
 +foreach (split /\s+/, $data-{files-excluded}) {
  `find $main_source_dir -path $main_source_dir/$_ -print0 
 | xargs -0 rm -rf`;
  };
 -foreach (grep {/^[^\/]+$/} split /\s+/, 
 $data-{files-excluded}) {
 -`find $main_source_dir -type f -name $_ -delete`;
 -# the statement above does not delete directories - just do 
 it now
 -`rm -rf $main_source_dir/$_ ` if ( -d 
 $main_source_dir/$_ ) ;
 -};
  my $nfiles_after = `find $main_source_dir | wc -l`;
  if ( $nfiles_before == $nfiles_after  ! $exclude__MACOSX ) {
  print -- Source tree remains identical - no need for 
 repacking.\n if $verbose;

I tested this patch and I have a problem with

   
svn://svn.debian.org/svn/debian-med/trunk/packages/rdp-classifier/trunk/debian

It specifies:

Files-Excluded:
__MACOSX
[a-z]*.jar

with the purpose to save ReadSeq.jar inside the source package.  This works 
with the
old method:

$ find . -name [a-z]*.jar
./rdp_classifier_2.5/lib/junit.jar
./rdp_classifier_2.5/lib/commons-cli.jar
./rdp_classifier_2.5/rdp_classifier-2.5.jar

but failes when trying your patch:

$ find . -path ./[a-z]*.jar
./rdp_classifier_2.5/lib/junit.jar
./rdp_classifier_2.5/lib/commons-cli.jar
./rdp_classifier_2.5/lib/ReadSeq.jar
./rdp_classifier_2.5/rdp_classifier-2.5.jar

I admit I did also not followed the DEP5 discussion very closely but the
current code could deal nicely with the specific removal which is not
the case with your proposal and I have no clue how to reasonably specify
the fact that all *.jars except one should be removed (besides
specifying every single file).

Kind regards

Andreas.

-- 
http://fam-tille.de


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120906203438.gc13...@an3as.eu



Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-09-05 Thread Jon Dowland
On Wed, Sep 05, 2012 at 09:04:19AM +0900, Charles Plessy wrote:
 the machine-readable format does not mention trailing slashes at the end
 of directory names, and refers to the -path test of the GNU find command,

Good. Having a trailing-slash be meaningful is very confusing. I especially
hate this with rsync, where I now use trailing-slash on dirs exclusively,
since I've memorized the behaviour when you do this and not when you don't.
It frustrates me because foo and foo/ might differ but they are both names
for the same thing - and the behaviour should not differ if the name for
the same thing differs IMHO.

 which will fail with trailing slashes.

I can understand why it would fail, since the argument to path is a pattern
rather than a filename, and it is compared against find's list of paths
which just so happen not to have trailing slashes. Having said that I wonder
if this behaviour could be considered buggy.


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120905105227.GA31962@debian



Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-09-04 Thread Andreas Tille
On Fri, Aug 31, 2012 at 11:38:15AM +0200, Jonas Smedegaard wrote:
 
 Hmm.  Seems my head was stock in a too old draft of DEP-5 - or 
 completely off track.  Sorry!
 
 I'll step back and let others figure out wat is proper interpretation.

You might like to check my last commit to

git://git.debian.org/git/users/tille/devscripts.git

and check whether it matches your expectations when injecting
Files-Excluded field into debian/copyright.

You might also like to test --repack-compression {gz,bz2,xz,lzma}.

Kind regards

   Andreas.

-- 
http://fam-tille.de


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120904171421.gb23...@an3as.eu



Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-09-04 Thread Nicolas Boulenguez
On Tue, Sep 04, 2012 at 07:14:21PM +0200, Andreas Tille wrote:

 You might like to check my last commit to
 git://git.debian.org/git/users/tille/devscripts.git

(currently 4b3a4a6310ff1ff80ac1498cf92a99817c75ffce)

 and check whether it matches your expectations when injecting
 Files-Excluded field into debian/copyright.

At first view, your changes correct
- foo should match foo, even if a directory.
but do not correct
- foo should not match bar/foo, even if a file.
- foo/ should never match, even if foo is a directory.

DEP5 does not distinguish files from directories, or whether paths
contain a slash. Why should you do these tests in your code? I suggest
the following patch.

diff --git a/scripts/uscan.pl b/scripts/uscan.pl
index 649f822..34e31a9 100755
--- a/scripts/uscan.pl
+++ b/scripts/uscan.pl
@@ -1494,17 +1494,9 @@ EOF
 print STDERR Error: $main_source_dir is no directory;
 }
 my $nfiles_before = `find $main_source_dir | wc -l`;
-foreach (grep {/\//} split /\s+/, $data-{files-excluded}) {
-# delete trailing '/' because otherwise find -path will fail
-s?/+$?? ;
-# use rm -rf to enable deleting non-empty directories
+foreach (split /\s+/, $data-{files-excluded}) {
 `find $main_source_dir -path $main_source_dir/$_ -print0 | 
xargs -0 rm -rf`;
 };
-foreach (grep {/^[^\/]+$/} split /\s+/, $data-{files-excluded}) 
{
-`find $main_source_dir -type f -name $_ -delete`;
-# the statement above does not delete directories - just do it 
now
-`rm -rf $main_source_dir/$_ ` if ( -d $main_source_dir/$_ 
) ;
-};
 my $nfiles_after = `find $main_source_dir | wc -l`;
 if ( $nfiles_before == $nfiles_after  ! $exclude__MACOSX ) {
 print -- Source tree remains identical - no need for 
repacking.\n if $verbose;


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120904231747.GA3868@pegase



Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-09-04 Thread Charles Plessy
Le Wed, Sep 05, 2012 at 01:17:47AM +0200, Nicolas Boulenguez a écrit :
 
 - foo should match foo, even if a directory.
 but do not correct
 - foo should not match bar/foo, even if a file.
 - foo/ should never match, even if foo is a directory.
 
 DEP5 does not distinguish files from directories, or whether paths
 contain a slash. Why should you do these tests in your code? I suggest
 the following patch.

Hello everybody,

the machine-readable format does not mention trailing slashes at the end
of directory names, and refers to the -path test of the GNU find command,
which will fail with trailing slashes.

Also, out of the 1,129 machine-readable files version 1.0 present in
the packages-metadata archive (http://wiki.debian.org/UpstreamMetadata),
only four have a trailing slash after a directory name:

$ for file in $(grep 
http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ */*.copyright 
-H | cut -f 1 -d ':' ) ; do grep-dctrl --eregex -FFiles -sFiles '/$'  $file 2 
/dev/null  echo $file; done
Files: modules/nginx-dav-ext-module/
n/nginx.copyright
Files: signpost-core/src/main/java/com/
o/oauth-signpost.copyright
Files: game/LuaUI/
s/spring.copyright
Files: lib/yard/rubygems/backports/
y/yard.copyright

(No Files field matched '/ ')

Andreas, I think that you can consider such entries a bug and output a warning
when finding them, or rejecting the whole file as invalid.

If this is unclear in the current specification, it can be clarified in the
next revision.

Have a nice day,

-- 
Charles Plessy
Tsurumi, Kanagawa, Japan


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120905000419.ga1...@falafel.plessy.net



Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-08-31 Thread Jonas Smedegaard
On 12-08-31 at 03:38am, Nicolas Boulenguez wrote:
 (apologizes for the previous empty mail)
 
 On Thu, Aug 30, 2012 at 11:44:34PM +0200, Andreas Tille wrote:
  On Thu, Aug 30, 2012 at 02:32:56AM +0200, Nicolas Boulenguez wrote:
 
   Assume that a and b are directories, if I understand well, the 
   current behaviour is to recursively remove a/b/, a/b and a/ 
   but to ignore a. I do not find that intuitive, and would suggest 
   to document it clearly.
 
  Could you suggest a wording which fits the requirement of clearly 
  documented.
 
 Here is the meaning of Files-Excluded patterns, as I understand from 
 the perl code at 
 http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493
 
  If $pattern contains no slash, then
   (A) execute `find $main_source_dir -type f -name $pattern -delete`
That is, remove all *files* in all subdirectories whose *base name* match.
  Else
   (B) remove the trailing slash from $pattern, if present
   (C) execute `find $main_source_dir -path $main_source_dir/$pattern
-print0 | xargs -0 rm -rf`
That is, remove all *files or subdirectories* whose *full path* match.
 
 Jonas says that this imitates Files patterns, but
 http://dep.debian.net/deps/dep5/#files-field mentions
 
  Patterns match pathnames that start at the root of the source tree.
  Thus, Makefile.in matches only the file at the root of the tree,
  but */Makefile.in matches at any depth.
 
 I understand that matching depends only on the existence of the path,
 not on it being a file or a directory. So
 - foo should match foo, even if a directory.
 - foo should not match bar/foo, even if a file.
 - foo/ should never match, even if foo is a directory.
 In short, I would only expect (C) as the implementation.
 
  It is required that there is distinction between files and
  directories (see the discussion on debian-devel, for instance [1] and
  [2]).
  [1] http://lists.debian.org/debian-devel/2012/08/msg00512.html
  [2] http://lists.debian.org/debian-devel/2012/08/msg00449.html
 
 In [2], Jonas says I believe it is better to..., but does not
 explain why.
 
   If I would feed 'a' to the removal algorithm it could simply
  happen, that a file c/a would be removed as well but it should not.
 
 With (C), the a pattern will not remove the c/a file.

Hmm.  Seems my head was stock in a too old draft of DEP-5 - or 
completely off track.  Sorry!

I'll step back and let others figure out wat is proper interpretation.


 - Jonas

-- 
 * Jonas Smedegaard - idealist  Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private


signature.asc
Description: Digital signature


Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-08-30 Thread Jonas Smedegaard
On 12-08-30 at 11:44pm, Andreas Tille wrote:
 I repost some extract from some private discussion about the 
 Files-Excluded enhancement of uscan where Nicolas Boulenguez found 
 some issues.  (Nicolas, I hope you don't mind if I quote some of your 
 non-private remarks in public.)
 
 Nicolas problem is mainly that if you specify
 
 Files-Excluded: foo
 
 and foo is a directory nothing is removed because the algorithm
 requires to specify rather
 
 Files-Excluded: foo/
 
 I wonder what you might think about this.

My proposal was and still is to keep things simple by reusing the syntax 
for Files paragraphs, which has this distinction between files and 
directories.

Yes, even if that is not simple in every way.


Regards,

 - Jonas

-- 
 * Jonas Smedegaard - idealist  Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private


signature.asc
Description: Digital signature


Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)

2012-08-30 Thread Nicolas Boulenguez
(apologizes for the previous empty mail)

On Thu, Aug 30, 2012 at 11:44:34PM +0200, Andreas Tille wrote:
 On Thu, Aug 30, 2012 at 02:32:56AM +0200, Nicolas Boulenguez wrote:

  Assume that a and b are directories, if I understand well, the
  current behaviour is to recursively remove a/b/, a/b and a/ but
  to ignore a. I do not find that intuitive, and would suggest to
  document it clearly.

 Could you suggest a wording which fits the requirement of clearly
 documented.

Here is the meaning of Files-Excluded patterns, as I understand from
the perl code at
http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493

 If $pattern contains no slash, then
  (A) execute `find $main_source_dir -type f -name $pattern -delete`
   That is, remove all *files* in all subdirectories whose *base name* match.
 Else
  (B) remove the trailing slash from $pattern, if present
  (C) execute `find $main_source_dir -path $main_source_dir/$pattern
   -print0 | xargs -0 rm -rf`
   That is, remove all *files or subdirectories* whose *full path* match.

Jonas says that this imitates Files patterns, but
http://dep.debian.net/deps/dep5/#files-field mentions

 Patterns match pathnames that start at the root of the source tree.
 Thus, Makefile.in matches only the file at the root of the tree,
 but */Makefile.in matches at any depth.

I understand that matching depends only on the existence of the path,
not on it being a file or a directory. So
- foo should match foo, even if a directory.
- foo should not match bar/foo, even if a file.
- foo/ should never match, even if foo is a directory.
In short, I would only expect (C) as the implementation.

 It is required that there is distinction between files and
 directories (see the discussion on debian-devel, for instance [1] and
 [2]).
 [1] http://lists.debian.org/debian-devel/2012/08/msg00512.html
 [2] http://lists.debian.org/debian-devel/2012/08/msg00449.html

In [2], Jonas says I believe it is better to..., but does not
explain why.

  If I would feed 'a' to the removal algorithm it could simply
 happen, that a file c/a would be removed as well but it should not.

With (C), the a pattern will not remove the c/a file.

 In this case you can not specify something like
 *.jar
 for jar files hanging around in lib/*.jar or so.  The code which tries
 to distinguish between files and pathes enables removing files in all
 directories.

With (C), the *.jar pattern will remove lib/foo.jar and usr/lib/bar.jar.

--

This suggestion is unrelated, but I repeat is because it was hidden in
the huge quoted mail. If we keep the current meaning, I suggest that
the implementation at
http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493
avoid splitting and grepping twice.

foreach (split /\s+/, $data-{files-excluded}) {
if (grep {/\//}) {
# delete trailing '/' because otherwise find -path will fail
s?/+$?? ;
# use rm -rf to enable deleting non-empty directories
`find $main_source_dir -path $main_source_dir/$_ -print0 | 
xargs -0 rm -rf`;
} else {
`find $main_source_dir -type f -name $_ -delete`;
};
};


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20120831013853.GA7999@pegase