bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Jim Meyering wrote:

Jim Meyering wrote:

Torbjorn Granlund wrote: The very old factoring code, cut from a now-obsolete version of GMP, does not pass proper arguments to the mpz_probab_prime_p function. It asks for 3 Miller-Rabin tests only, which is not sufficient.

Hi Torbjorn,

Thank you for the patch and explanation. I've converted that into the commit below in your name. Please proofread it and let me know if you'd like to change anything. I tweaked the patch to change MR_REPS from a #define to an enum and to add the comment just preceding. I'll add NEWS and tests separately.

...
From: Torbjorn Granlund t...@gmplib.org
Date: Tue, 4 Sep 2012 16:22:47 +0200
Subject: [PATCH] factor: don't ever declare composites to be prime

Torbjörn, I've just noticed that I misspelled your name above. Here's the NEWS/tests addition. Following is an adjusted commit that spells your name properly.

From e561ff991b74dc19f6728aa1e6e61d1927055ac1 Mon Sep 17 00:00:00 2001

There have been enough changes (mostly typo fixes) that I'm re-posting these for review before I push. Also, I added this sentence to NEWS about the performance hit:

  The fix makes factor somewhat slower (~25%) for ranges of consecutive
  numbers, and up to 8 times slower for some worst-case individual numbers.

From 68cf62bb04ecd138c81b68539c2a065250ca4390 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbj=C3=B6rn=20Granlund?= t...@gmplib.org
Date: Tue, 4 Sep 2012 18:38:29 +0200
Subject: [PATCH 1/2] factor: don't ever declare composites to be prime

The multiple-precision factoring code (with HAVE_GMP) was copied from a now-obsolete version of GMP that did not pass proper arguments to the mpz_probab_prime_p function. It made that code perform no more than 3 Miller-Rabin tests, which is not sufficient.

A Miller-Rabin test will detect composites with a probability of at least 3/4. For a uniform random composite, the probability will actually be much higher. Put another way: of the N-3 possible Miller-Rabin tests for checking the composite N, there is no N for which more than (N-3)/4 of the tests fail to detect the number as a composite. For most numbers N, the number of false witnesses will be much, much lower.

Problem numbers are of the form N = pq, with p, q prime and (p-1)/(q-1) = s, where s is a small integer. (There are other problem forms too, involving 3 or more prime factors.) When s = 2, we get the 3/4 factor.

It is easy to find numbers of that form that cause coreutils' factor to fail:

  465658903 2242724851 6635692801 17709149503 17754345703 20889169003
  42743470771 54890944111 72047131003 85862644003 98275842811
  114654168091 117225546301 ...

There are 9008992 composites of this form with s = 2 below 2^64. With 3 Miller-Rabin tests, one would expect about 9008992/64 = 140766 of them to be invalidly recognized as primes in that range.

* src/factor.c (MR_REPS): Define to 25.
(factor_using_pollard_rho): Use MR_REPS, not 3.
(print_factors_multi): Likewise.
* THANKS.in: Remove my name, now that it will be automatically included in the generated THANKS file.
---
 THANKS.in    | 1 -
 src/factor.c | 9 ++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/THANKS.in b/THANKS.in
index 1580151..2c3f83c 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -608,7 +608,6 @@ Tony Leneis t...@plaza.ds.adp.com
 Tony Robinson a...@eng.cam.ac.uk
 Toomas Soome toomas.so...@elion.ee
 Toralf Förster toralf.foers...@gmx.de
-Torbjorn Granlund t...@nada.kth.se
 Torbjorn Lindgren t...@funcom.no
 Torsten Landschoff tors...@pclab.ifg.uni-kiel.de
 Travis Gummels tgumm...@redhat.com

diff --git a/src/factor.c b/src/factor.c
index 1d55805..e63e0e0 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -153,6 +153,9 @@ factor_using_division (mpz_t t, unsigned int limit)
   mpz_clear (r);
 }

+/* The number of Miller-Rabin tests we require.  */
+enum { MR_REPS = 25 };
+
 static void
 factor_using_pollard_rho (mpz_t n, int a_int)
 {
@@ -222,7 +225,7 @@ S4:
       mpz_div (n, n, g);	/* divide by g, before g is overwritten */

-      if (!mpz_probab_prime_p (g, 3))
+      if (!mpz_probab_prime_p (g, MR_REPS))
         {
           do
             {
@@ -242,7 +245,7 @@ S4:
       mpz_mod (x, x, n);
       mpz_mod (x1, x1, n);
       mpz_mod (y, y, n);
-      if (mpz_probab_prime_p (n, 3))
+      if (mpz_probab_prime_p (n, MR_REPS))
         {
           emit_factor (n);
           break;
@@ -411,7 +414,7 @@ print_factors_multi (mpz_t t)
   if (mpz_cmp_ui (t, 1) != 0)
     {
       debug ("[is number prime?] ");
-      if (mpz_probab_prime_p (t, 3))
+      if (mpz_probab_prime_p (t, MR_REPS))
        emit_factor (t);
       else
        factor_using_pollard_rho (t, 1);
--
1.7.12.176.g3fc0e4c
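To make the failure mode concrete: the repetition count is simply the second argument to mpz_probab_prime_p. Below is a minimal sketch (not part of the thread or the patch) probing the first listed composite; 465658903 = 15259 * 30517, and (30517-1)/(15259-1) = 2, i.e. the worst-case s = 2 form described above. Whether a run with only 3 rounds actually misses it depends on the bases GMP happens to pick, but 25 rounds make a miss astronomically unlikely.

/* Sketch: probe one of the problem composites.  Build with: cc probe.c -lgmp */
#include <gmp.h>
#include <stdio.h>

int
main (void)
{
  mpz_t n;
  mpz_init_set_str (n, "465658903", 10);  /* = 15259 * 30517, the s = 2 form */
  /* mpz_probab_prime_p returns 2 = surely prime, 1 = probably prime,
     0 = composite.  The correct answer here is 0.  */
  printf ("3 reps:  %d\n", mpz_probab_prime_p (n, 3));
  printf ("25 reps: %d\n", mpz_probab_prime_p (n, 25));
  mpz_clear (n);
  return 0;
}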
bug#12365: closed (Re: Should cp -n return 0, when DEST exists?)
On Thu, Sep 6, 2012 at 8:03 PM, GNU bug Tracking System help-debb...@gnu.org wrote:

Your bug report #12365: Incorrect return value of cp with no-clobber option, which was filed against the coreutils package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 12...@debbugs.gnu.org.

-- 
12365: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=12365
GNU Bug Tracking System
Contact help-debb...@gnu.org with problems

-- Forwarded message --
From: Eric Blake ebl...@redhat.com
To: Anoop Sharma sendtoan...@gmail.com
Cc: 12365-d...@debbugs.gnu.org, coreutils coreut...@gnu.org
Date: Thu, 06 Sep 2012 08:31:53 -0600
Subject: Re: Should cp -n return 0, when DEST exists?

tag 12365 wontfix
thanks

On 09/06/2012 04:50 AM, Anoop Sharma wrote: When the -n option of cp is used and the DEST file exists, then, as expected, cp is not able to copy SOURCE to DEST. However, cp returns 0 in this case.

cp -n is not mandated by POSIX, so we are free to do as we wish here. But looking at history, we added -n for coreutils 7.1 in Feb 2009, and the mail from that thread includes https://lists.gnu.org/archive/html/bug-coreutils/2008-12/msg00159.html, which states we are modeling after FreeBSD. A quick check on my FreeBSD 8.2 VM:

  $ echo one > bar
  $ echo two > blah
  $ cp -n blah bar
  $ echo $?
  0
  $ cat bar
  one

shows that FreeBSD also returns 0 in this case, and I don't want to break interoperability. Therefore, I'm going to close this as a WONTFIX, unless you also get buy-in from the BSD folks.

By the way, there's no need to post three separate emails with the same contents without first waiting at least 24 hours. Like most other moderated GNU lists, you do not have to be a subscriber to post, and even if you are a subscriber, your first post to a given list will be held in a moderation queue for as long as it takes for a human to approve your email address as a non-spammer for all future posts (generally less than a day).

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

-- Forwarded message --
From: Anoop Sharma sendtoan...@gmail.com
To: bug-coreutils@gnu.org
Date: Thu, 6 Sep 2012 17:27:52 +0530
Subject: Incorrect return value of cp with no-clobber option

When the -n (--no-clobber) option of cp is used and the DEST file exists, then, as expected, cp is not able to copy SOURCE to DEST. However, cp returns 0 in this case. Shouldn't it return 1, to indicate that the copy operation could not be completed? In the absence of this indication, how is one to know that some recovery action, like re-trying cp with some other DEST name, is required?

Regards,
Anoop

Thank you, Eric. I am a newbie to open source development tools and processes. I had posted earlier to bug-coreutils@gnu.org and got an acknowledgement mail immediately. Subsequently, I subscribed to coreut...@gnu.org and have now been subscribed for more than a month. I originally posted this mail to that list for discussion. However, there was no acknowledgement from there, and I mistakenly assumed that some spam filter was stopping my mails from reaching the list. Therefore, I tweaked the text a bit in an attempt to get past the spam filter and tried multiple times. Finally, as a work-around, I posted to bug-coreutils@gnu.org and stopped thereafter, because I got an ack again! I will be more patient next time!

Thanks for educating,
Anoop
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Paul Eggert egg...@cs.ucla.edu writes:

On 09/06/2012 02:33 PM, Jim Meyering wrote:

* We have some hardwired W_TYPE_SIZE settings for the code interfacing to longlong.h. It is now 64 bits. It will break on systems where uintmax_t is not a 64-bit type. Please see the beginning of factor.c.

I wonder how many types of systems would be affected.

It's only a matter of time. GCC already supports 128-bit integers on my everyday host (Fedora 17, x86-64, GCC 4.7.1). Eventually uintmax_t will grow past 64 bits, if only for the crypto guys.

It should however be noted that uintmax_t stays at 64 bits even with GCC's 128-bit integers. I think the latter are declared as not being integers, or something along those lines, to avoid the ABI-breaking change of redefining uintmax_t.

If the code needs exactly-64-bit unsigned integers, shouldn't it be using uint64_t? That's the standard way of doing that sort of thing. Gnulib can supply the type on pre-C99 platforms. Weird but standard-conforming platforms that don't have uint64_t will be out of luck, but surely they're out of luck anyway.

The code does not need any particular size of uintmax_t, except that we need a preprocessor-time size measurement of it. The reason for this is longlong.h's tests of which single-line asm code to include. The new factor program works without longlong.h, but some parts of it will become 3-4 times slower. To disable longlong.h, compile with -DUSE_LONGLONG_H=0. (The worst affected parts would probably be the single-word Lucas code and all double-word factoring.)

I suppose that an autoconf test of the type size will be needed, at least for theoretical portability, if longlong.h is to be retained.

There is one other place where some (hypothetical) portability problems may exist, and that's make-prime-list.c. It prints a list of uintmax_t literals.

We let the coreutils maintainers worry about the allowable complexity of the factor program; Niels and I are happy to sacrifice some speed to lower the code complexity. But first we will increase it by retrofitting GMP factoring code. :o)

-- 
Torbjörn
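The preprocessor-time width measurement Torbjörn mentions could be done along these lines. This is a sketch of one plausible approach, not the actual factor.c code; it relies on C99's guarantee that UINTMAX_MAX is usable in #if:

/* Sketch only: pick W_TYPE_SIZE for longlong.h from uintmax_t's width.  */
#include <stdint.h>

#if UINTMAX_MAX == 0xffffffffffffffffULL
# define W_TYPE_SIZE 64
#elif UINTMAX_MAX == 0xffffffffUL
# define W_TYPE_SIZE 32
#else
# error "unexpected uintmax_t width; build with -DUSE_LONGLONG_H=0"
#endif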
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Torbjorn Granlund wrote:

...

* We have some hardwired W_TYPE_SIZE settings for the code interfacing to longlong.h. It is now 64 bits. It will break on systems where uintmax_t is not a 64-bit type. Please see the beginning of factor.c.

I wonder how many types of systems would be affected.

It is not used currently anywhere in coreutils? Perhaps coreutils could use autoconf for checking this?

uintmax_t is used throughout coreutils, but nowhere (that comes to mind) does it fail when UINTMAX_MAX happens to be different from 2^64-1. What I was wondering is how many systems have a uintmax_t that is only 32 bits wide. Now that I reread, I suppose this code would be OK (albeit slower) with uintmax_t wider than 64 bits.

(If we're really crazy, we could speed the factor program by an additional 20% by using blocked input with e.g. fread.)

Please take a look at the generated code for factor_using_division, towards the end, where 8 imulq should be found (on amd64). The code uses mov, imul, cmp, jbe for testing the divisibility by a prime; the branch is taken when the prime divides the number being factored, and is thus highly non-taken. (I suppose we could do a better job at describing the maths, with some references. This particular trick is from "Division by invariant integers using multiplication".)

Any place you can add a reference would be most welcome. Here's one where I'd appreciate a reference in a comment:

#define MAGIC64 ((uint64_t) 0x0202021202030213ULL)
#define MAGIC63 ((uint64_t) 0x0402483012450293ULL)
#define MAGIC65 ((uint64_t) 0x218a019866014613ULL)
#define MAGIC11 0x23b

/* Returns the square root if the input is a square, otherwise 0.  */
static uintmax_t
is_square (uintmax_t x)
{
  /* Uses the tests suggested by Cohen.  Excludes 99% of the
     non-squares before computing the square root.  */
  if (((MAGIC64 >> (x & 63)) & 1)
      && ((MAGIC63 >> (x % 63)) & 1)
      /* Both 0 and 64 are squares mod (65).  */
      && ((MAGIC65 >> ((x % 65) & 63)) & 1)
      && ((MAGIC11 >> (x % 11)) & 1))
    {
      uintmax_t r = isqrt (x);
      if (r * r == x)
        return r;
    }
  return 0;
}
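The MAGIC constants are bitmasks of quadratic residues: bit r of MAGIC64 is set exactly when r is a square modulo 64, and likewise for MAGIC63 (mod 63) and MAGIC65 (mod 65). As a sanity check, MAGIC64 can be regenerated with a short loop; this sketch is illustrative, not code from factor.c:

/* Sketch: regenerate MAGIC64.  Analogous loops with "% 63" or "% 65"
   in place of "& 63" yield MAGIC63 and MAGIC65.  */
#include <stdint.h>
#include <stdio.h>

int
main (void)
{
  uint64_t mask = 0;
  for (unsigned int i = 0; i < 64; i++)
    mask |= (uint64_t) 1 << ((i * i) & 63);
  /* Prints 0x0202021202030213, matching MAGIC64.  */
  printf ("0x%016llx\n", (unsigned long long) mask);
  return 0;
}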
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Jim Meyering j...@meyering.net writes:

The existing code can factor arbitrarily large numbers quickly, as long as they have no large prime factors. We should retain that capability.

My understanding is that most gnu/linux distributions build coreutils without linking to gmp, so lots of users don't get this capability. If this is an important feature, maybe one should consider bundling mini-gmp and using that as a fallback in case coreutils is configured without gmp (see http://gmplib.org:8000/gmp/file/7677276bdf92/mini-gmp/README). I would expect it to be a constant factor (maybe 10) slower than the real gmp for numbers up to a few hundred bits (for larger numbers it gets much slower, due to the lack of sophisticated algorithms, but we probably can't factor those in reasonable time anyway).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
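Because mini-gmp implements a subset of the GMP API in a single C file, the fallback Niels suggests could plausibly reduce to a conditional include like the sketch below (the header name is mini-gmp's own; how coreutils would actually wire this up is an assumption):

/* Sketch of the suggested fallback; not actual coreutils code.  */
#if HAVE_GMP
# include <gmp.h>
#else
# include "mini-gmp.h"  /* bundled single-file fallback */
#endif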
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Torbjorn Granlund t...@gmplib.org writes:

There is one other place where some (hypothetical) portability problems may exist, and that's make-prime-list.c. It prints a list of uintmax_t literals.

I don't think the prime sieving is a problem, but for each (odd) prime p it also computes p^{-1} mod 2^{bits} and floor((2^{bits} - 1) / p), where bits is the size of a uintmax_t. This will break cross compilation if uintmax_t is of different size on the build and host systems, or if different suffixes (U, UL, ULL) are needed in the generated primes.h.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
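For concreteness, the p^{-1} mod 2^{bits} computation Niels refers to is typically done with a Newton/Hensel iteration like the sketch below (an illustration, not the actual make-prime-list.c code). Its dependence on the word width is exactly why a build machine with a different uintmax_t produces wrong constants for the host:

/* Sketch: invert odd p modulo 2^64.  For odd p, inv = p is already
   correct to 3 low bits (p*p == 1 mod 8); each step doubles the number
   of correct bits: 3 -> 6 -> 12 -> 24 -> 48 -> 96 >= 64.  */
#include <stdint.h>

static uint64_t
binv (uint64_t p)
{
  uint64_t inv = p;
  for (int i = 0; i < 5; i++)
    inv *= 2 - p * inv;
  return inv;  /* satisfies p * inv == 1 (mod 2^64) */
}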
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
On 09/07/2012 09:43 AM, Niels Möller wrote:

Jim Meyering j...@meyering.net writes: The existing code can factor arbitrarily large numbers quickly, as long as they have no large prime factors. We should retain that capability.

My understanding is that most gnu/linux distributions build coreutils without linking to gmp. So lots of users don't get this capability. If this is an important feature, maybe one should consider bundling mini-gmp and using that as a fallback in case coreutils is configured without gmp (see http://gmplib.org:8000/gmp/file/7677276bdf92/mini-gmp/README). I would expect it to be a constant factor (maybe 10) slower than the real gmp for numbers up to a few hundred bits.

Bundling libraries is bad if one needs to update them. The correct approach here is to file a bug against your distro to enable gmp, which is a trivial matter of adding the build and runtime dependency on gmp.

cheers,
Pádraig.
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
On 09/07/2012 07:19 AM, Jim Meyering wrote: There have been enough changes (mostly typo fixes) that I'm re-posting these for review before I push. Also, I added this sentence to NEWS about the performance hit, too The fix makes factor somewhat slower (~25%) for ranges of consecutive numbers, and up to 8 times slower for some worst-case individual numbers. Thanks for collating all the tweaks. +1 Pádraig.
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Pádraig Brady p...@draigbrady.com writes:

On 09/07/2012 09:43 AM, Niels Möller wrote: If this is an important feature, maybe one should consider bundling mini-gmp

Bundling libraries is bad if one needs to update them.

mini-gmp is not an ordinary library. It's a single portable C source file (currently around 4000 lines) implementing a subset of the GMP API, with performance only a few times slower than the real thing for small bignums. It's *intended* for bundling with applications, either for unconditional use or for use as a fallback when the real gmp library is not available. It's never (I hope!) going to be installed in /usr/lib. To me, coreutils' factor seems to be a close match for what it's intended for.

That said, mini-gmp is pretty new (I wrote most of it around last Christmas) and I'm not aware of any application or library using it yet. I think the guile hackers are considering using it (for the benefit of applications which use guile as an extension language but don't need high-performance bignums). So if you decide to use it in coreutils, you'll be pioneers. It *is* used in the GMP build process, for precomputing various internal tables.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
On 09/07/2012 11:35 AM, Niels Möller wrote:

Pádraig Brady p...@draigbrady.com writes: On 09/07/2012 09:43 AM, Niels Möller wrote: If this is an important feature, maybe one should consider bundling mini-gmp

Bundling libraries is bad if one needs to update them.

mini-gmp is not an ordinary library. It's a single portable C source file (currently around 4000 lines) implementing a subset of the GMP API, with performance only a few times slower than the real thing for small bignums. It's *intended* for bundling with applications, either for unconditional use or for use as a fallback when the real gmp library is not available. It's never (I hope!) going to be installed in /usr/lib. To me, coreutils' factor seems to be a close match for what it's intended for. That said, mini-gmp is pretty new (I wrote most of it around last Christmas) and I'm not aware of any application or library using it yet. I think the guile hackers are considering using it. So if you decide to use it in coreutils, you'll be pioneers. It *is* used in the GMP build process, for precomputing various internal tables.

I can see the need when bootstrapping, but I'd prefer that coreutils just rely on regular GMP. That said, I see there is some pushback in Debian on depending on GMP. Note that expr from coreutils also uses GMP, which may sway the decision.

thanks,
Pádraig.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Jim Meyering wrote:

Linda Walsh wrote: ... GNU needs to be clear about their priorities -- maintaining software freedom, or bowing down to corporate powers... POSIX isn't

While POSIX is in general a very good baseline, no one here conforms blindly. If POSIX is wrong, we'll lobby to change it, or, when that fails, maybe relegate the undesirable required behavior to when POSIXLY_CORRECT is set, or even simply ignore it. In fact, over the years, I have deliberately made a few GNU tools contravene some aspects of POSIX-specified behavior that I felt were counterproductive. We try to make the tools as useful as possible, sometimes adding features when we deem them worthwhile. However, we are very much against changing the *default* behavior (behavior that has been that way for over 20 years and that is compatible with all other vendor-supplied rm programs) without a very good reason.

So if I make the behavior conditional on an ENV var, RM_FILES_DEPTH_FIRST, then you'd have no problem accepting the patch?
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
On 09/07/2012 08:16 AM, Linda Walsh wrote:

We try to make the tools as useful as possible, sometimes adding features when we deem them worthwhile. However, we are very much against changing the *default* behavior (behavior that has been that way for over 20 years and that is compatible with all other vendor-supplied rm programs) without a very good reason.

So if I make it enabled with an ENV var set to RM_FILES_DEPTH_FIRST, to enable the behavior, then you'd have no problem accepting the patch?

I personally detest new env-vars that change long-standing behavior, because you then have to audit EVERY SINGLE SCRIPT to ensure that its use is unimpacted if the new env-var is set. It must either be an existing env-var, or my personal preference of a new --long-option. But if you want to submit a patch so that 'rm -r --depth-first .' does what you want, I'm probably 60-40 in favor of including it.

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc Description: OpenPGP digital signature
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Eric Blake wrote:

I personally detest new env-vars that change long-standing behavior, because you then have to audit EVERY SINGLE SCRIPT to ensure that its use is unimpacted if the new env-var is set. It must either be an existing env-var, or my personal preference of a new --long-option. But if you want to submit a patch so that 'rm -r --depth-first .' does what you want, I'm probably 60-40 in favor of including it.

---
I wouldn't be opposed to adding it in addition, but I don't want the extra typing for what is the more common case for me. But given that the current behavior is to return an error -- and there is an expectation of being able to type in non-working commands just to see the error message -- imagine their surprise, and how they would curse, if you added an option that actually made that previously illegal action work. Most of the people who type in random wrong commands just to see error messages aren't smart enough to use environment variables.
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
On 09/07/2012 03:35 AM, Niels Möller wrote: It's *intended* for bundling with applications, either for unconditional use, or for use as a fallback if the real gmp library is not available. I've been looking for something like that for Emacs, since I want Emacs to use bignums. Do you think it'd be suitable? One hassle I have with combining Emacs and GMP is that Emacs wants to control how memory is allocated, and wants its memory allocator to longjmp out if memory gets low, and GMP is documented to not support that. If the mini-gmp library doesn't have this problem I'm thinking that Emacs might use it *instead* of GMP.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Bob Proulx wrote:

Because I originally voted that this felt like a bug, I wanted to state that after determining that this has already been legacy system historical practice for a very long time, I wouldn't change it now. Portability of applications is more important.

Right now, the feature is unused, so hurting compatibility is not an issue -- users can push for the feature on other systems as needed. It certainly isn't a safety issue, since rm ** on a bouncy keyboard is a lot easier to type than rm -fr *, and the former will remove all files under the current dir (just none of the directories). I suppose rm **; rmdir ** would work -- but that would require the SHELL. I think adding the case of rm -fr . (or dirname/.) to delete the contents of the dir, but not the dir itself, makes more sense and is safer than the easier-to-type rm **.

This isn't a feature that could be working in a script for someone. It isn't something that was recently removed that would cause a script to break. A script will run now with the same behavior across multiple different types of systems. I think we should leave things unchanged.

It's only been recently that I've noticed rm -fr . not working, and I can't figure out why, since it hasn't been around for so long.

Consider the parallel: if I want to make sure I copy the contents of a dir, I need to use cp dir/. dest/. If I use dir/ or dir, they both end up in dest (even with the /). That means without using ., the contents are not addressable, so what is demanded by 'cp' is refused by 'rm'. That is not a consistent user interface and is symptomatic of poor design.

Using . to reference the content of a dir is standard in other utils -- that it doesn't work in 'rm' goes counter to the idea of how rm works -- you have to remove contents before trying the current dir. It isn't logical to think that it would try the current dir before anything else, as that goes completely contrary to how rm has to work. I say it's a design flaw and inconsistent with other programs. But if I can set an env var and have it work on my system, someone else can work to get the coreutils to work consistently...
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Eric Blake wrote:

Then set up a shell alias or a wrapper script that comes first in your $PATH. Then it is under your explicit control, while the default is still appropriate for everyone else. Just because the defaults don't match your expectations doesn't mean you can't change the behavior _on your system_ to avoid extra typing on your part.

---
That doesn't work for programs that need to call rm to remove all files in a dir. And I already have changed the default behavior on MY systems: I already applied the patch below, which only enables my behavior if the user isn't running in POSIXLY_CORRECT mode. Now, it's certainly easier to set an env var in one place to control script behavior than to make changes in all the places... I'm just trying to get the env var in so I don't have to distribute a working version of rm with my scripts.

-
--- src/remove.c	2011-10-10 00:56:46.0 -0700
+++ src/remove.c.new	2012-09-06 14:28:07.816810683 -0700
@@ -173,6 +173,35 @@
     }
 }
 
+
+/* Separate functions to check for the next part of a file name being
+   dotdot or... */
+
+static inline bool
+dotdot (char const *file_name)
+{
+  if (file_name[0] == '.' && file_name[1])
+    {
+      char sep = file_name[(file_name[1] == '.') + 1];
+      return (! sep || ISSLASH (sep));
+    }
+  else
+    return false;
+}
+
+/* dot */
+
+static inline bool
+dot (char const *file_name)
+{
+  if (file_name[0] == '.')
+    {
+      char sep = file_name[1];
+      return (! sep || ISSLASH (sep));
+    }
+  else
+    return false;
+}
+
 /* Prompt whether to remove FILENAME (ent->, if required via a combination
    of the options specified by X and/or file attributes.  If the file may
    be removed, return RM_OK.  If the user declines to remove the file,
@@ -203,6 +232,7 @@
   int dirent_type = is_dir ? DT_DIR : DT_UNKNOWN;
   int write_protected = 0;
+  int special_delete_content = 0;
 
   /* When nonzero, this indicates that we failed to remove a child entry,
      either because the user declined an interactive prompt, or due to
@@ -222,7 +252,11 @@
       wp_errno = errno;
     }
 
-  if (write_protected || x->interactive == RMI_ALWAYS)
+  if (!x->posix_correctly && dot (filename) && !x->force)
+    special_delete_content = 1;
+
+  if (write_protected || x->interactive == RMI_ALWAYS
+      || special_delete_content)
     {
       if (0 <= write_protected && dirent_type == DT_UNKNOWN)
         {
@@ -281,11 +315,16 @@
       if (dirent_type == DT_DIR
           && mode == PA_DESCEND_INTO_DIR
           && !is_empty)
-        fprintf (stderr,
-                 (write_protected
-                  ? _("%s: descend into write-protected directory %s? ")
-                  : _("%s: descend into directory %s? ")),
-                 program_name, quoted_name);
+        {
+          char *action = special_delete_content
+                         ? _("delete contents of")
+                         : _("descend into");
+          fprintf (stderr,
+                   (write_protected
+                    ? _("%s: %s write-protected directory %s? ")
+                    : _("%s: %s directory %s? ")),
+                   program_name, action, quoted_name);
+        }
       else
         {
           if (cache_fstatat (fd_cwd, filename, &sbuf, AT_SYMLINK_NOFOLLOW) != 0)
@@ -476,7 +515,8 @@
       /* If the basename of a command line argument is "." or "..",
          diagnose it and do nothing more with that argument.  */
-      if (dot_or_dotdot (last_component (ent->fts_accpath)))
+      if ((x->posix_correctly ? dot_or_dotdot : dotdot)
+          (last_component (ent->fts_accpath)))
         {
           error (0, 0, _("cannot remove directory: %s"),
                  quote (ent->fts_path));
--- src/remove.h	2011-07-28 03:38:27.0 -0700
+++ src/remove.h	2012-09-06 13:33:01.282362765 -0700
@@ -34,6 +34,14 @@
   /* If true, ignore nonexistent files.  */
   bool ignore_missing_files;
 
+  /* True if force (-f) was specified, indicating the user knows what
+     they are doing and doesn't want to be questioned or see errors
+     from the command.  */
+  bool force;
+
+  /* True for users wanting strict POSIX compliance over more flexible,
+     lax, or useful behaviors.  */
+  bool posix_correctly;
+
   /* If true, query the user about whether to remove each file.  */
   enum rm_interactive interactive;
--- src/rm.c	2011-10-02 02:20:54.0 -0700
+++ src/rm.c	2012-09-06 13:33:04.132500554 -0700
@@ -206,6
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
On 09/07/2012 08:54 AM, Linda Walsh wrote:

Using . to reference the content of a dir is standard in other utils -- that it doesn't work in 'rm' goes counter to the idea of how rm works -- you have to remove contents before trying the current dir. It isn't logical to think that it would try the current dir before anything else, as that goes completely contrary to how rm has to work.

At the syscall level, unlink(".") is required to fail. To remove a directory, you must remove its proper name. You can use unlink("../child") on systems like Linux that let you remove a directory that is used as a process' current working directory (on systems like Windows, where this action is forbidden, there's no way to remove the current working directory). Therefore, at the shell level, POSIX will let you do 'rm -r ../child'. If you think that POSIX should _also_ let you attempt 'rm -r .', then propose that as a defect report against POSIX, rather than griping here.

I say it's a design flaw and inconsistent with other programs.

I would say that it is not a design flaw, but that it is consistent with the fact that the unlink(".") syscall is required to fail, and that it is consistent with other Unix implementations. We can agree to disagree on that point.

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc Description: OpenPGP digital signature
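The syscall-level behavior Eric cites is easy to observe directly. Here is a minimal sketch (rm removes directories with rmdir(); POSIX requires rmdir to fail when the final path component is dot, with errno EINVAL on Linux and the BSDs):

/* Sketch: rmdir(".") must fail, regardless of the directory's contents.  */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main (void)
{
  if (rmdir (".") != 0)
    printf ("rmdir(\".\") failed: %s\n", strerror (errno));
  return 0;
}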
bug#12366: [gnu-prog-discuss] bug#12366: Writing unwritable files
On 06/09/2012 19:23, Paul Eggert wrote: The file replacement is atomic. The reading of the file is not.

Sure, but the point is that from the end user's point of view, 'sed -i' is not atomic, and can't be expected to be atomic. Atomic file replacement is what matters for security.

Paolo
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
Jim Meyering j...@meyering.net writes:

uintmax_t is used throughout coreutils, but nowhere (that comes to mind) does it fail when UINTMAX_MAX happens to be different from 2^64-1. What I was wondering is how many systems have a uintmax_t that is only 32 bits wide. Now that I reread, I suppose this code would be OK (albeit slower) with uintmax_t wider than 64 bits.

The code will work with longlong.h iff W_TYPE_SIZE is defined to the bit size of uintmax_t.

Any place you can add a reference would be most welcome.

I have added comments here and there. More comments might be desirable.

Here's one where I'd appreciate a reference in a comment:

#define MAGIC64 ((uint64_t) 0x0202021202030213ULL)
#define MAGIC63 ((uint64_t) 0x0402483012450293ULL)
#define MAGIC65 ((uint64_t) 0x218a019866014613ULL)
#define MAGIC11 0x23b

I added a comment explaining these constants.

Here is a new version of the code. It now has GMP factoring code, updated from the GMP demos code.

nt-factor-002.tar.lz
Description: Binary data

-- 
Torbjörn
bug#12366: [gnu-prog-discuss] bug#12366: Writing unwritable files
On 09/07/2012 09:38 AM, Paolo Bonzini wrote: Atomic file replacement is what matters for security.

Unfortunately, sed's use of atomic file replacement does not suffice for security. For example, suppose sysadmins (mistakenly) followed the practice of using 'sed -i' to remove users from /etc/passwd. And suppose there are two misbehaving users, moe and larry, and two sysadmins, bonzini and eggert. bonzini discovers that moe's misbehaving, and types:

  sed -i '/^moe:/d' /etc/passwd

and thinks, "Great! moe can't log in any more." Similarly, eggert discovers that larry's misbehaving, and types:

  sed -i '/^larry:/d' /etc/passwd

and thinks, "All right! I've done my job too." Unfortunately, it could be that moe can still log in afterwards. Or maybe larry can. We don't know, because 'sed -i' is not atomic, which means /etc/passwd might contain moe afterwards, or maybe larry.

Of course one could wrap 'sed -i' inside a larger script that arranges for atomicity at the end-user level. But the same is true for 'sort -o'. Perhaps the method of 'sed -i' buys the user *something*, but whatever that something is, it isn't immediately obvious. When it comes to security mechanisms, simplicity and clarity are critical, and unfortunately 'sed -i' has problems in this area, just as 'sort -o' does.
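The race Paul describes can be made concrete with a sketch of the read -> filter -> rename shape of 'sed -i' (illustrative names only; this is not sed's actual code). The rename at the end is atomic, but any update committed by another process between the fopen and the rename is silently overwritten, which is exactly the moe/larry scenario:

/* Sketch: the non-atomic window inside an atomic-replacement edit.
   If two processes run this concurrently, the second rename wins and
   resurrects the line the first one deleted.  */
#include <stdio.h>
#include <string.h>

static int
delete_matching_lines (const char *file, const char *prefix)
{
  char tmp[4096], line[4096];
  snprintf (tmp, sizeof tmp, "%s.tmp", file);  /* illustrative temp name */
  FILE *in = fopen (file, "r");   /* window opens: old contents are read */
  if (!in)
    return -1;
  FILE *out = fopen (tmp, "w");
  if (!out)
    {
      fclose (in);
      return -1;
    }
  while (fgets (line, sizeof line, in))
    if (strncmp (line, prefix, strlen (prefix)) != 0)
      fputs (line, out);          /* filtered copy written to the temp file */
  fclose (in);
  fclose (out);
  return rename (tmp, file);      /* window closes: atomic replacement */
}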
bug#12366: [gnu-prog-discuss] bug#12366: Writing unwritable files
Paul Eggert wrote:

Paolo Bonzini wrote: Atomic file replacement is what matters for security.

Unfortunately, 'sed's use of atomic file replacement does not suffice for security. For example, suppose sysadmins (mistakenly) followed the practice of using 'sed -i' to remove users from /etc/passwd. And suppose there are two misbehaving users, moe and larry, and two sysadmins, bonzini and eggert. bonzini discovers that moe's misbehaving, and types: sed -i '/^moe:/d' /etc/passwd

Using /etc/passwd isn't a good example, because system convention dictates that an /etc/passwd.lock must be observed for any edits there, specifically for the problem you are illustrating. The above would not be correct even if sed were fully atomic overall.

Of course one could wrap 'sed -i' inside a larger script that arranges for atomicity at the end-user level.

Right. The 'vipw' script, for example. :-)

[I have abused the EDITOR variable for that purpose many times. Set it to either an inline script or to a real script and use it to safely edit these types of files. More with 'visudo', though.]

Bob
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Eric Blake wrote:

On 09/07/2012 08:54 AM, Linda Walsh wrote: Using . to reference the content of a dir is standard in other utils -- that it doesn't work in 'rm' goes counter to the idea of how rm works -- you have to remove contents before trying the current dir. It isn't logical to think that it would try the current dir before anything else, as that goes completely contrary to how rm has to work.

At the syscall level, unlink(".") is required to fail. To remove a directory, you must remove its proper name. You can use unlink("../child") on systems like Linux that let you remove a directory that is used as a process' current working directory (on systems like Windows, where this action is forbidden, there's no way to remove the current working directory). Therefore, at the shell level, POSIX will let you do 'rm -r ../child'. If you think that POSIX should _also_ let you attempt 'rm -r .', then propose that as a defect report against POSIX, rather than griping here.

I say it's a design flaw and inconsistent with other programs.

I would say that it is not a design flaw, but that it is consistent with the fact that the unlink(".") syscall is required to fail, and that it is consistent with other Unix implementations. We can agree to disagree on that point.

Really, I didn't say rm -fr . should *delete* the current directory -- IT SHOULD FAIL -- you are 100% correct. But it is true that anyone who knows the smallest bit about Unix knows that you have to empty a directory before deleting the directory, and thus rm _MUST_ do a depth-first traversal. If it did, and gave an error at the end: no issue. It's the special check BEFORE doing the work it would normally do, and failing BEFORE it does its NORMAL work -- the depth-first deletion -- that I am against.

Griping against POSIX is like griping against the government. But very few people always go the speed limit, and they would regard a vehicle that is *unable* to function normally as faulty. So I am not disagreeing that it should fail. Please be clear about what I am asking.

Also, I would *expect* that rm -r . would at least _ask_ you if you wanted to remove files under a directory that will be unable to be deleted. I am only asking for the behavior I describe to work without issuing an error when I do rm -fr: I am specifically asking rm to forcefully remove what it can, and to CONTINUE to delete what it can in spite of any errors it might encounter. Again, the fact that this fails defies normal logic with 'rm'.

I don't believe that rm -fr . has failed for 20 years. I don't know when it changed, but it used to be that rm didn't have a special check for . -- BECAUSE -- as you mention, an attempt to unlink . will fail -- and using -f suppresses any error message.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
On 09/07/2012 02:56 PM, Linda Walsh wrote:

Really, I didn't say rm -fr . should *delete* the current directory -- IT SHOULD FAIL -- you are 100% correct. But it is true that anyone who knows the smallest bit about Unix knows that you have to empty a directory before deleting the directory, and thus rm _MUST_ do a depth-first traversal. If it did, and gave an error at the end: no issue.

Indeed, reading the original V7 source code from 1979:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c

	while(--argc > 0) {
		if(!strcmp(*++argv, "..")) {
			fprintf(stderr, "rm: cannot remove `..'\n");
			continue;
		}
		rm(*argv, fflg, rflg, iflg, 0);
	}

shows that _only_ ".." was special; "." was attempted in-place and didn't fail until the unlink(".") after the directory itself had been emptied. It wasn't until later versions of the code that "." also became special.

You therefore may have a valid point that POSIX standardized something that did not match existing practice at the time, and therefore it would be reasonable to propose a POSIX defect that requires early failure on "..", but changes the behavior on "." and "/" to only permit, but not require, early failure. However, I just checked, and the prohibition of an early exit on "." has been around since at least POSIX 2001, so you are now coming into the game at least 11 years late. So, until you take it up with the POSIX folks, I don't think anyone on the coreutils side cares enough to bother changing the default behavior, now that it has been standardized, even though the standardized behavior is tighter than the original pre-standard behavior.

Griping against POSIX is like griping against the government.

No, I actually find the Austin Group quite reasonable to work with, especially if you can provide backup evidence like the V7 source snippet I just mentioned.

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc Description: OpenPGP digital signature
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Eric Blake wrote:

You therefore may have a valid point that POSIX standardized something that did not match existing practice at the time, and therefore it would be reasonable to propose a POSIX defect that requires early failure on "..", but changes the behavior on "." and "/" to only permit, but not require, early failure. However, I just checked, and the prohibition of an early exit on "." has been around since at least POSIX 2001, so you are now coming into the game at least 11 years late.

Those changes only started hitting the field a few years ago. Bash just started working to adopt the 2003 standard with its 4.0 version -- before that it was 1999 -- I didn't even know there was a 2001.

Except that trying to get them to change things now, I'd encounter the same arguments I get here -- that users expect to be able to have -f not really mean force, and to report errors on ".". Not that I believe that -- I just think most users aren't aware or don't care -- but that would be the reasoning. I get it here, so why would I expect better from someone whose job is to come up with lame rules that defy standard practice (last I looked they were proposing to ban space (as well as 0x01-0x1f) in file names)? Attempting to deal with people who want to turn POSIX into a restriction document -- not a standard reflecting current implementations -- is well beyond my social abilities.

I can't even get engineers -- when faced with clear evidence of programs that put out inconsistent output -- to fix them. They know it's bad output, and even warn that they are about to do the wrong thing in warnings. Somehow this is considered preferable to doing something useful. So expecting a group that is heavily into bureaucracy to listen to reason just doesn't seem like a reasonable expectation. I did go to their website, though, and see what they were discussing, and when I saw that sentiment was going in favor of limiting allowed characters in filenames, I was too ill to stay.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Eric Blake writes:

Indeed, reading the original V7 source code from 1979:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c
[...]
shows that _only_ ".." was special; "." was attempted in-place and didn't fail until the unlink(".") after the directory itself had been emptied. It wasn't until later versions of the code that "." also became special.

I also decided to look around there, and found some of the turning points:

Up to 4.2BSD, the V7 behavior was kept.
(http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/bin/rm.c)

rm -rf . was forbidden in 4.3BSD (26 years ago).
http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD/usr/src/bin/rm.c

The removal of dir/. (and dir/..) was not forbidden until Reno.
http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/bin/rm/rm.c

	cp = rindex(arg, '/');
	if (cp == NULL)
		cp = arg;
	else
		++cp;
	if (isdot(cp)) {
		fprintf(stderr, "rm: cannot remove `.' or `..'\n");
		return (0);
	}

Maybe the classical behavior stuck around longer in the more SysV-ish Unices. The Ultrix-11 3.1 tree on TUHS from 1988 has an rm that looks very much like V7, but I can't find anything to compare it to until OpenSolaris.

Did POSIX force BSD to change their rm in 1988? I think it's more likely that POSIX simply documents a restriction that BSD had already added. Either way, the latest POSIX revisions certainly can't be blamed.

-- 
Alan Curry
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
On 09/07/2012 03:30 PM, Linda Walsh wrote:

Not that I believe that -- I just think most users aren't aware or don't care, but that would be the reasoning. I get it here, so why would I expect better from someone whose job is to come up with lame rules that defy standard practice (last I looked they were proposing to ban space (as well as 0x01-0x1f) in file names)?

You aren't looking very hard, then. The proposal currently being considered by the Austin Group is a ban on newline (and newline only) from file names, because that is the one and only character whose presence in file names causes ambiguous output for line-oriented tools. Forbidding space and most non-printing control characters was rejected as impractical. And even with the proposed ban on newline, it is still just that - a proposal, and not a hard rule, waiting for implementation practice to see if it is even doable.

http://www.austingroupbugs.net/view.php?id=251

Read the whole thing. The original poster mentioned a much tighter bound, but it was shot down; the _only_ thing left on the table under current discussion is _just_ the limitation on newline.

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc Description: OpenPGP digital signature
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
On 09/07/2012 03:20 PM, Eric Blake wrote:

Indeed, reading the original V7 source code from 1979:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c
shows that _only_ ".." was special; "." was attempted in-place and didn't fail until the unlink(".") after the directory itself had been emptied. It wasn't until later versions of the code that "." also became special. You therefore may have a valid point that POSIX standardized something that did not match existing practice at the time, and therefore it would be reasonable to propose a POSIX defect that requires early failure on "..", but changes the behavior on "." and "/" to only permit, but not require, early failure. However, I just checked, and the prohibition of an early exit on "." has been around since at least POSIX 2001, so you are now coming into the game at least 11 years late.

In addition to Alan's argument that 4.3BSD forbade '.' before POSIX began (and therefore the POSIX folks DID standardize existing practice, even if it wasn't universally common at the time), I find this statement from POSIX quite informative (line 104265 in POSIX 2008) on why any proposal to allow 'rm -rf .' to remove non-dot files will probably be denied:

  The rm utility is forbidden to remove the names dot and dot-dot in
  order to avoid the consequences of inadvertently doing something like:
  rm -r .*

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc Description: OpenPGP digital signature
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Eric Blake wrote:

  The rm utility is forbidden to remove the names dot and dot-dot in
  order to avoid the consequences of inadvertently doing something like:
  rm -r .*

---
Which is why, IMO, I thought rm -r .* should ask if they really want to remove all files under . as the first question, as it would show up first in such a situation. As stated before, I am more interested in the -f = "force it anyway" option, which says to let it fail, and continue, ignoring failure. I think that may be where the problem has been introduced. I never used rm - .

Certainly rm ** is easier to mistype than rm -r .*, so by that logic, that should be disallowed as well? I submit it is the behavior of -f that has changed -- it used to mean force: continue in spite of errors -- and it is that behavior that has changed, as I would always have expected rm -r . to at least return some error I didn't care about. What I wanted was the depth-first removal, and -f to force it to continue despite errors. How long has -f NOT meant --force? Now it only overlooks write-protection errors, which sounds very weak.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
On 09/07/2012 09:02 AM, Linda Walsh wrote:

--- src/remove.c	2011-10-10 00:56:46.0 -0700
+++ src/remove.c.new	2012-09-06 14:28:07.816810683 -0700

Thanks for making an attempt to show what you want in code. However, you provided no ChangeLog entry, no mention in NEWS, and no documentation. Also, you do not have copyright assignment on file with the FSF (but if you'd like to pursue this patch further, we can help you complete the copyright assignment paperwork). Therefore, this patch cannot be taken as-is.

@@ -203,6 +232,7 @@
   int dirent_type = is_dir ? DT_DIR : DT_UNKNOWN;
   int write_protected = 0;
+  int special_delete_content = 0;

Furthermore, your indentation appears hideous in this email; I'm not sure how you created the patch, and whether this is an artifact of your mailer corrupting things or whether you really did disregard existing indentation, but you'd have to clean that up before your patch can be anywhere worth including.

+  char *action = special_delete_content
+                 ? _("delete contents of")
+                 : _("descend into");
+  fprintf (stderr,
+           (write_protected
+            ? _("%s: %s write-protected directory %s? ")
+            : _("%s: %s directory %s? ")),

This is a translation no-no (not to mention that your hideous indentation made it hard to read because it was so much longer than 80 columns). Please don't split English sentences across two separate _() calls that are then pasted together, but rather write two _() calls of the two complete sentences.

+++ src/rm.c	2012-09-06 13:33:04.132500554 -0700
@@ -206,6 +206,7 @@
   bool preserve_root = true;
   struct rm_options x;
   bool prompt_once = false;
+  x.posix_correctly = (getenv ("POSIXLY_CORRECT") != NULL);

Elsewhere in coreutils, we name such a variable posixly_correct, not posix_correctly. And finally, remember my advice - if you want this mode, add it as a new long option, and NOT as an abuse of POSIXLY_CORRECT, if you want to avoid controversy and even stand a chance of getting it approved for inclusion.

-- 
Eric Blake ebl...@redhat.com  +1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc Description: OpenPGP digital signature
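A sketch of the translation-friendly form Eric is asking for, with each prompt kept as one complete sentence per _() call (function and variable names follow the patch under review; the _ macro here is a stand-in for gettext):

/* Sketch only: whole sentences, nothing pasted together at runtime.  */
#include <stdio.h>
#define _(msgid) (msgid)  /* stand-in for gettext */

static void
prompt (const char *program_name, const char *quoted_name,
        int special_delete_content)
{
  fprintf (stderr,
           special_delete_content
           ? _("%s: delete contents of directory %s? ")
           : _("%s: descend into directory %s? "),
           program_name, quoted_name);
}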
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Alan Curry wrote:

Eric Blake writes:

	if (isdot(cp)) {
		fprintf(stderr, "rm: cannot remove `.' or `..'\n");
		return (0);
	}

---
The thing is, by doing rm -rf on ., I am not trying to remove . or .. -- I'm trying to remove the files in it. Otherwise there is no way, using rm, to specify deleting the contents of a directory but not the directory itself. I just want to clean out a directory -- I don't want to try to delete the directory itself. 4.3BSD is breaking the intended functionality of the depth-first *requirement* for rm to remove a dir. It's well known that the contents must be removed before ., so trying to remove . should only fail after the depth-first traversal -- that would be expected behavior.

I'm NOT trying to remove .; I'm using it to indicate that I only want the contents of the directory deleted. It's like using / with a symlink to get at the dir, except that doesn't work reliably -- the only thing that works reliably to get at the contents is .

Try cp|mv -r dir1/ dir2/ -- they'll move dir1 under dir2, the same as without the -r. The only way for -r to address content is by using . as the source, as in cp -a src/. dst[/]. Note that cp -a src/. dst/. works, but try to mv, and you get the type of error you'd expect from trying to destroy b/. Note the copy works but not the mv. cp should realize it is writing . to ., which is illegal -- but it allows it because it knows what the user *wants* to do. 'mv' is stricter, as you want to move one inode over the top of another. As a source address, . is allowed to mean content, but as a destination, it isn't allowed as a writable destination.

That's the problem with rm: using rm -r '.' specifies the start of a recursive operation where writes will begin, depth first. It's not until the very end that it tries to do the illegal write (delete) operation on ".". If 'cp' allows . as a source (and even accommodates it as a target), so should 'rm' allow . to mean the start of a deletion, but NOT the dir itself. That's the interpretation that cp is using -- if it were trying to cp the dir itself over the top of the target, it would give an error message, but it doesn't.
bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)
I found a problem with the GMP integration. We have a 100 byte buffer in the stdin reading code, which was adequate before we used GMP, but now one might want to attempt to factor much larger numbers. We'll fix that, but not tonight. -- Torbjörn
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.
Eric Blake wrote: ...coding stuff...

Thanks for the advice... I will take it appreciatively; however, it was a few hours' effort in unfamiliar code. I certainly wouldn't write NEWS/CHANGES if I didn't have an initial agreement that it would go in.

More to the point: others are objecting (I'm willing to admit some reasonability in the objection) to changing the default behavior. I proposed adding an ENV var that would need to be specified to get the new behavior. Thus it *would not* be changing default behavior. Does everyone get that... as that's been offered as an acceptable compromise.

Vs. the option of adding it as a long option -- that's pushing it too far, and it doesn't work for me, as it's easier for me to maintain and distribute a patch to rm, or my own version, than to have it as a long option. Reason: doing it in rm OR as an ENV var does it in one place, and all my code/interactivity benefits. Doing it as a long option must be paid for on each use. The cost doesn't work for me, nor would it work for anyone considering cost vs. benefit.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
On 09/07/2012 06:02 PM, Linda Walsh wrote: The thing is, by doing rm -rf on ., I am not trying to remove . or .. I'm trying to remove the files in it. Other wise there is no way to specify, using rm to delete the contents of a directory but not the directory itself. Yes there is, and Paul already told it to you: rm -rf * .[!.] .??* I just want to clean out a directory -- I don't want to try to delete the directory itself. Then use the triple-glob. This is portable to both POSIX and to the old implementations we have been discussing. -- Eric Blake ebl...@redhat.com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Eric Blake wrote: On 09/07/2012 06:02 PM, Linda Walsh wrote: The thing is, by doing rm -rf on ., I am not trying to remove . or .. I'm trying to remove the files in it. Other wise there is no way to specify, using rm to delete the contents of a directory but not the directory itself. Yes there is, and Paul already told it to you: rm -rf * .[!.] .??* You must have missed that rm doesn't expand shell globs... and I don't want to get the shell involved for rm'ing files anymore than cp needs to to copy directories or the files in a dir and not the dir.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
On 09/07/2012 06:25 PM, Linda Walsh wrote: I don't want to get the shell involved That's not a reasonable constraint. The shell is a standard tool for solving this sort of problem, and involving the shell solves this problem.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
The shell is one of the things I'm trying not to have a dependency on. It doesn't pass a reliability test, as it does too much. I want a utility that removes files -- single files or depth-recursive -- and it can fail on the current dir or target dir after it finishes, like it is documented to do, or it can fail at the beginning, as long as the -f option said to ignore errors and keep going. I don't want a wildcard solution. It's a wildcard. (Doh!)

The issue was changing the default, no? You don't think I'm being reasonable by agreeing and proposing a non-default environment var? Why should cp accept . as addressing the contents of a directory, but rm be deliberately crippled? We've excluded safety, since no one will get to the option unless they've enabled it, and unless they chose force, they should get a prompt asking if they want to delete the files in the protected directory ., right?

So far no one has addressed when the change in -f went in NOT to ignore the non-deletable dir . and continue the recursive delete, as normal behavior would have it do. POSIX claims the rationale is to prevent accidental deletion of everything in the current dir when using rm -r .; however, if it is queried in that case -- and noting that rm ** is just as dangerous, if not as clean, since it 'only' deletes all files and leaves the dir skeleton -- POSIX's rationale wouldn't seem to be 1) that important, and 2) it would be addressed by sticking with the current behavior or prompting first, which would make more sense, rather than arbitrarily deciding for them and removing the ability of rm to remove just files -- by itself, with no other utilities invoked.

Why shouldn't rm have the ability to target only everything under a specified point, rather than always including the point?
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Linda Walsh writes:

So far no one has addressed when the change in -f went in NOT to ignore the non-deletable dir . and continue the recursive delete,

In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno), the -f option is not consulted before rejecting removal of ".", so I don't think the change you're referring to is a change at all. -f never had the effect you think it should have.

-- 
Alan Curry
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Alan Curry wrote:

Linda Walsh writes: So far no one has addressed when the change in -f went in NOT to ignore the non-deletable dir . and continue the recursive delete,

In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno), the -f option is not consulted before rejecting removal of ".", so I don't think the change you're referring to is a change at all. -f never had the effect you think it should have.

If I was using BSD, I would agree.
---
But most of my usage has been on SysV compatibles: Solaris, SGI, Linux, and a short while on SunOS back in the late 80's, but that would have been before it changed anyway. For all I know it could have been a vendor add-in, but that's not the whole point here.

Do you want to support making . illegal for all GNU utils for addressing content? If not, then we should look at making a decision that it can be used to address content, and ensure the interface is consistent going forward. I think you'll find many more people against the idea, wondering why it's in 'rm', why -f doesn't really mean "ignore all the errors it can", and why that one case should be specially treated. Of course they also might wonder why rm doesn't follow the necessary algorithm for deleting files -- deleting contents before dying, issuing an error for being unable to delete a parent. Which might also raise the question of why -f shouldn't be usable to silence permission or access errors, as it was designed to.

There are plenty of good reasons, aside from BSD historic usage, why it should be designed in, especially when it's being tucked away as a non-default behavior that would need environmental triggering to even be available.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
On 09/07/2012 08:06 PM, Linda Walsh wrote: The shell is one of the things I'm trying not to have a dependency on. That sounds unnecessarily impractical. It's been decades since I used a system that had 'rm' but didn't have a shell that could solve this problem easily. By the way, Alan, that was a nice trip down memory lane! I've *used* all those 'rm' implementations. I'm even old enough to remember when 'rm' automatically refused to remove itself. You don't think I'm being reasonable by agreeing and saying a non default environment var? No, because then currently-working scripts might have to be changed to guard against that variable being set. Clearly you don't agree with the POSIX rationale. Reasonable people can disagree, and then move on. The nice thing about free software is that you can build your system the way you like. You don't have to convert the rest of us.
bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it
Linda Walsh writes:

Alan Curry wrote:

Linda Walsh writes: So far no one has addressed when the change in -f went in NOT to ignore the non-deletable dir . and continue the recursive delete,

In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno), the -f option is not consulted before rejecting removal of ".", so I don't think the change you're referring to is a change at all. -f never had the effect you think it should have.

If I was using BSD, I would agree.
---
But most of my usage has been on SysV compatibles: Solaris, SGI, Linux, and a short while on SunOS back in the late 80's, but that would have been before it changed anyway.

SGI is dead, Sun is dead, the game's over, we're the winners, and our rm has been this way forever.

For all I know it could have been a vendor add-in, but that's not the whole point here. Do you want to support making . illegal for all GNU utils for addressing content?

I don't think "addressing content" is a clearly defined operation, no matter how many times you repeat it. Consistency between tools is a good thing, but consistency between OSes is also good, and we'd be losing that if any change were made to GNU rm's default behavior. Even OpenSolaris has the restriction: see lines 160-170 of http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/rm/rm.c

I think you'll find many more people against the idea, wondering why it's in 'rm', why -f doesn't really mean "ignore all the errors it can", and why that one case should be specially treated. Of course they also might wonder why rm doesn't follow the necessary algorithm for deleting files -- deleting contents before dying, issuing an error for being unable to delete a parent. Which might also raise the question of why -f shouldn't be usable to silence permission or access errors, as it was designed to.

Look, I agree it's not logical or elegant. But we have a standard that all current Unices are obeying, and logic and elegance alone aren't enough to justify changing that. A new option that you can put in an alias is really the most realistic goal.

-- 
Alan Curry