Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
Hi Nicolas, On Wed, Sep 05, 2012 at 01:17:47AM +0200, Nicolas Boulenguez wrote: diff --git a/scripts/uscan.pl b/scripts/uscan.pl index 649f822..34e31a9 100755 --- a/scripts/uscan.pl +++ b/scripts/uscan.pl @@ -1494,17 +1494,9 @@ EOF print STDERR Error: $main_source_dir is no directory; } my $nfiles_before = `find $main_source_dir | wc -l`; -foreach (grep {/\//} split /\s+/, $data-{files-excluded}) { -# delete trailing '/' because otherwise find -path will fail -s?/+$?? ; -# use rm -rf to enable deleting non-empty directories +foreach (split /\s+/, $data-{files-excluded}) { `find $main_source_dir -path $main_source_dir/$_ -print0 | xargs -0 rm -rf`; }; -foreach (grep {/^[^\/]+$/} split /\s+/, $data-{files-excluded}) { -`find $main_source_dir -type f -name $_ -delete`; -# the statement above does not delete directories - just do it now -`rm -rf $main_source_dir/$_ ` if ( -d $main_source_dir/$_ ) ; -}; my $nfiles_after = `find $main_source_dir | wc -l`; if ( $nfiles_before == $nfiles_after ! $exclude__MACOSX ) { print -- Source tree remains identical - no need for repacking.\n if $verbose; I tested this patch and I have a problem with svn://svn.debian.org/svn/debian-med/trunk/packages/rdp-classifier/trunk/debian It specifies: Files-Excluded: __MACOSX [a-z]*.jar with the purpose to save ReadSeq.jar inside the source package. This works with the old method: $ find . -name [a-z]*.jar ./rdp_classifier_2.5/lib/junit.jar ./rdp_classifier_2.5/lib/commons-cli.jar ./rdp_classifier_2.5/rdp_classifier-2.5.jar but failes when trying your patch: $ find . -path ./[a-z]*.jar ./rdp_classifier_2.5/lib/junit.jar ./rdp_classifier_2.5/lib/commons-cli.jar ./rdp_classifier_2.5/lib/ReadSeq.jar ./rdp_classifier_2.5/rdp_classifier-2.5.jar I admit I did also not followed the DEP5 discussion very closely but the current code could deal nicely with the specific removal which is not the case with your proposal and I have no clue how to reasonably specify the fact that all *.jars except one should be removed (besides specifying every single file). Kind regards Andreas. -- http://fam-tille.de -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120906203438.gc13...@an3as.eu
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
On Wed, Sep 05, 2012 at 09:04:19AM +0900, Charles Plessy wrote: the machine-readable format does not mention trailing slashes at the end of directory names, and refers to the -path test of the GNU find command, Good. Having a trailing-slash be meaningful is very confusing. I especially hate this with rsync, where I now use trailing-slash on dirs exclusively, since I've memorized the behaviour when you do this and not when you don't. It frustrates me because foo and foo/ might differ but they are both names for the same thing - and the behaviour should not differ if the name for the same thing differs IMHO. which will fail with trailing slashes. I can understand why it would fail, since the argument to path is a pattern rather than a filename, and it is compared against find's list of paths which just so happen not to have trailing slashes. Having said that I wonder if this behaviour could be considered buggy. -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120905105227.GA31962@debian
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
On Fri, Aug 31, 2012 at 11:38:15AM +0200, Jonas Smedegaard wrote: Hmm. Seems my head was stock in a too old draft of DEP-5 - or completely off track. Sorry! I'll step back and let others figure out wat is proper interpretation. You might like to check my last commit to git://git.debian.org/git/users/tille/devscripts.git and check whether it matches your expectations when injecting Files-Excluded field into debian/copyright. You might also like to test --repack-compression {gz,bz2,xz,lzma}. Kind regards Andreas. -- http://fam-tille.de -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120904171421.gb23...@an3as.eu
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
On Tue, Sep 04, 2012 at 07:14:21PM +0200, Andreas Tille wrote: You might like to check my last commit to git://git.debian.org/git/users/tille/devscripts.git (currently 4b3a4a6310ff1ff80ac1498cf92a99817c75ffce) and check whether it matches your expectations when injecting Files-Excluded field into debian/copyright. At first view, your changes correct - foo should match foo, even if a directory. but do not correct - foo should not match bar/foo, even if a file. - foo/ should never match, even if foo is a directory. DEP5 does not distinguish files from directories, or whether paths contain a slash. Why should you do these tests in your code? I suggest the following patch. diff --git a/scripts/uscan.pl b/scripts/uscan.pl index 649f822..34e31a9 100755 --- a/scripts/uscan.pl +++ b/scripts/uscan.pl @@ -1494,17 +1494,9 @@ EOF print STDERR Error: $main_source_dir is no directory; } my $nfiles_before = `find $main_source_dir | wc -l`; -foreach (grep {/\//} split /\s+/, $data-{files-excluded}) { -# delete trailing '/' because otherwise find -path will fail -s?/+$?? ; -# use rm -rf to enable deleting non-empty directories +foreach (split /\s+/, $data-{files-excluded}) { `find $main_source_dir -path $main_source_dir/$_ -print0 | xargs -0 rm -rf`; }; -foreach (grep {/^[^\/]+$/} split /\s+/, $data-{files-excluded}) { -`find $main_source_dir -type f -name $_ -delete`; -# the statement above does not delete directories - just do it now -`rm -rf $main_source_dir/$_ ` if ( -d $main_source_dir/$_ ) ; -}; my $nfiles_after = `find $main_source_dir | wc -l`; if ( $nfiles_before == $nfiles_after ! $exclude__MACOSX ) { print -- Source tree remains identical - no need for repacking.\n if $verbose; -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120904231747.GA3868@pegase
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
Le Wed, Sep 05, 2012 at 01:17:47AM +0200, Nicolas Boulenguez a écrit : - foo should match foo, even if a directory. but do not correct - foo should not match bar/foo, even if a file. - foo/ should never match, even if foo is a directory. DEP5 does not distinguish files from directories, or whether paths contain a slash. Why should you do these tests in your code? I suggest the following patch. Hello everybody, the machine-readable format does not mention trailing slashes at the end of directory names, and refers to the -path test of the GNU find command, which will fail with trailing slashes. Also, out of the 1,129 machine-readable files version 1.0 present in the packages-metadata archive (http://wiki.debian.org/UpstreamMetadata), only four have a trailing slash after a directory name: $ for file in $(grep http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ */*.copyright -H | cut -f 1 -d ':' ) ; do grep-dctrl --eregex -FFiles -sFiles '/$' $file 2 /dev/null echo $file; done Files: modules/nginx-dav-ext-module/ n/nginx.copyright Files: signpost-core/src/main/java/com/ o/oauth-signpost.copyright Files: game/LuaUI/ s/spring.copyright Files: lib/yard/rubygems/backports/ y/yard.copyright (No Files field matched '/ ') Andreas, I think that you can consider such entries a bug and output a warning when finding them, or rejecting the whole file as invalid. If this is unclear in the current specification, it can be clarified in the next revision. Have a nice day, -- Charles Plessy Tsurumi, Kanagawa, Japan -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120905000419.ga1...@falafel.plessy.net
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
On 12-08-31 at 03:38am, Nicolas Boulenguez wrote: (apologizes for the previous empty mail) On Thu, Aug 30, 2012 at 11:44:34PM +0200, Andreas Tille wrote: On Thu, Aug 30, 2012 at 02:32:56AM +0200, Nicolas Boulenguez wrote: Assume that a and b are directories, if I understand well, the current behaviour is to recursively remove a/b/, a/b and a/ but to ignore a. I do not find that intuitive, and would suggest to document it clearly. Could you suggest a wording which fits the requirement of clearly documented. Here is the meaning of Files-Excluded patterns, as I understand from the perl code at http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493 If $pattern contains no slash, then (A) execute `find $main_source_dir -type f -name $pattern -delete` That is, remove all *files* in all subdirectories whose *base name* match. Else (B) remove the trailing slash from $pattern, if present (C) execute `find $main_source_dir -path $main_source_dir/$pattern -print0 | xargs -0 rm -rf` That is, remove all *files or subdirectories* whose *full path* match. Jonas says that this imitates Files patterns, but http://dep.debian.net/deps/dep5/#files-field mentions Patterns match pathnames that start at the root of the source tree. Thus, Makefile.in matches only the file at the root of the tree, but */Makefile.in matches at any depth. I understand that matching depends only on the existence of the path, not on it being a file or a directory. So - foo should match foo, even if a directory. - foo should not match bar/foo, even if a file. - foo/ should never match, even if foo is a directory. In short, I would only expect (C) as the implementation. It is required that there is distinction between files and directories (see the discussion on debian-devel, for instance [1] and [2]). [1] http://lists.debian.org/debian-devel/2012/08/msg00512.html [2] http://lists.debian.org/debian-devel/2012/08/msg00449.html In [2], Jonas says I believe it is better to..., but does not explain why. If I would feed 'a' to the removal algorithm it could simply happen, that a file c/a would be removed as well but it should not. With (C), the a pattern will not remove the c/a file. Hmm. Seems my head was stock in a too old draft of DEP-5 - or completely off track. Sorry! I'll step back and let others figure out wat is proper interpretation. - Jonas -- * Jonas Smedegaard - idealist Internet-arkitekt * Tlf.: +45 40843136 Website: http://dr.jones.dk/ [x] quote me freely [ ] ask before reusing [ ] keep private signature.asc Description: Digital signature
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
On 12-08-30 at 11:44pm, Andreas Tille wrote: I repost some extract from some private discussion about the Files-Excluded enhancement of uscan where Nicolas Boulenguez found some issues. (Nicolas, I hope you don't mind if I quote some of your non-private remarks in public.) Nicolas problem is mainly that if you specify Files-Excluded: foo and foo is a directory nothing is removed because the algorithm requires to specify rather Files-Excluded: foo/ I wonder what you might think about this. My proposal was and still is to keep things simple by reusing the syntax for Files paragraphs, which has this distinction between files and directories. Yes, even if that is not simple in every way. Regards, - Jonas -- * Jonas Smedegaard - idealist Internet-arkitekt * Tlf.: +45 40843136 Website: http://dr.jones.dk/ [x] quote me freely [ ] ask before reusing [ ] keep private signature.asc Description: Digital signature
Re: Discussion of uscan enhancement 1 (Was: uscan enhancement take 3: script hook)
(apologizes for the previous empty mail) On Thu, Aug 30, 2012 at 11:44:34PM +0200, Andreas Tille wrote: On Thu, Aug 30, 2012 at 02:32:56AM +0200, Nicolas Boulenguez wrote: Assume that a and b are directories, if I understand well, the current behaviour is to recursively remove a/b/, a/b and a/ but to ignore a. I do not find that intuitive, and would suggest to document it clearly. Could you suggest a wording which fits the requirement of clearly documented. Here is the meaning of Files-Excluded patterns, as I understand from the perl code at http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493 If $pattern contains no slash, then (A) execute `find $main_source_dir -type f -name $pattern -delete` That is, remove all *files* in all subdirectories whose *base name* match. Else (B) remove the trailing slash from $pattern, if present (C) execute `find $main_source_dir -path $main_source_dir/$pattern -print0 | xargs -0 rm -rf` That is, remove all *files or subdirectories* whose *full path* match. Jonas says that this imitates Files patterns, but http://dep.debian.net/deps/dep5/#files-field mentions Patterns match pathnames that start at the root of the source tree. Thus, Makefile.in matches only the file at the root of the tree, but */Makefile.in matches at any depth. I understand that matching depends only on the existence of the path, not on it being a file or a directory. So - foo should match foo, even if a directory. - foo should not match bar/foo, even if a file. - foo/ should never match, even if foo is a directory. In short, I would only expect (C) as the implementation. It is required that there is distinction between files and directories (see the discussion on debian-devel, for instance [1] and [2]). [1] http://lists.debian.org/debian-devel/2012/08/msg00512.html [2] http://lists.debian.org/debian-devel/2012/08/msg00449.html In [2], Jonas says I believe it is better to..., but does not explain why. If I would feed 'a' to the removal algorithm it could simply happen, that a file c/a would be removed as well but it should not. With (C), the a pattern will not remove the c/a file. In this case you can not specify something like *.jar for jar files hanging around in lib/*.jar or so. The code which tries to distinguish between files and pathes enables removing files in all directories. With (C), the *.jar pattern will remove lib/foo.jar and usr/lib/bar.jar. -- This suggestion is unrelated, but I repeat is because it was hidden in the huge quoted mail. If we keep the current meaning, I suggest that the implementation at http://anonscm.debian.org/gitweb/?p=users/tille/devscripts.git;a=blob;f=scripts/uscan.pl;hb=bb686d032543d8ba8c5cd9c36ff8c2d9c3310761#l1493 avoid splitting and grepping twice. foreach (split /\s+/, $data-{files-excluded}) { if (grep {/\//}) { # delete trailing '/' because otherwise find -path will fail s?/+$?? ; # use rm -rf to enable deleting non-empty directories `find $main_source_dir -path $main_source_dir/$_ -print0 | xargs -0 rm -rf`; } else { `find $main_source_dir -type f -name $_ -delete`; }; }; -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20120831013853.GA7999@pegase