bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Jim Meyering
Jim Meyering wrote:
 Jim Meyering wrote:

 Torbjorn Granlund wrote:
 The very old factoring code, cut from a now-obsolete version of GMP, does
 not pass proper arguments to the mpz_probab_prime_p function.  It asks
 for only 3 Miller-Rabin tests, which is not sufficient.

 Hi Torbjorn

 Thank you for the patch and explanation.
 I've converted that into the commit below in your name.
 Please proofread it and let me know if you'd like to change anything.
 I tweaked the patch to change MR_REPS from a #define to an enum
 and to add the comment just preceding.

 I'll add NEWS and tests separately.
 ...
 From: Torbjorn Granlund t...@gmplib.org
 Date: Tue, 4 Sep 2012 16:22:47 +0200
 Subject: [PATCH] factor: don't ever declare composites to be prime

 Torbjörn, I've just noticed that I misspelled your name above.

 Here's the NEWS/tests addition.
 Following is an adjusted commit that spells your name properly.

From e561ff991b74dc19f6728aa1e6e61d1927055ac1 Mon Sep 17 00:00:00 2001

There have been enough changes (mostly typo fixes) that I'm re-posting
these for review before I push.  I also added this sentence about the
performance hit to NEWS:

The fix makes factor somewhat slower (~25%) for ranges of consecutive
numbers, and up to 8 times slower for some worst-case individual numbers.


From 68cf62bb04ecd138c81b68539c2a065250ca4390 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbj=C3=B6rn=20Granlund?= t...@gmplib.org
Date: Tue, 4 Sep 2012 18:38:29 +0200
Subject: [PATCH 1/2] factor: don't ever declare composites to be prime

The multiple-precision factoring code (with HAVE_GMP) was copied from
a now-obsolete version of GMP that did not pass proper arguments to
the mpz_probab_prime_p function: it requested only 3 Miller-Rabin
tests, which is not sufficient.

A Miller-Rabin test will detect a composite with probability at least
3/4.  For a uniform random composite, the probability will actually
be much higher.

Or put another way, of the N-3 possible Miller-Rabin tests for checking
the composite N, there is no number N for which more than (N-3)/4 of the
tests will fail to detect the number as a composite.  For most numbers N
the number of false witnesses will be much, much lower.

Problem numbers are of the form N=pq, p,q prime and (p-1)/(q-1) = s,
where s is a small integer.  (There are other problem forms too,
involving 3 or more prime factors.)  When s = 2, we get the 3/4 factor.

It is easy to find numbers of that form that cause coreutils' factor to
fail:

  465658903
  2242724851
  6635692801
  17709149503
  17754345703
  20889169003
  42743470771
  54890944111
  72047131003
  85862644003
  98275842811
  114654168091
  117225546301
  ...

There are 9008992 composites of the form with s=2 below 2^64.  With 3
Miller-Rabin tests, each of which a worst-case composite passes with
probability 1/4, one would expect about 9008992 * (1/4)^3 = 9008992/64
= 140766 to be invalidly recognized as primes in that range.

* src/factor.c (MR_REPS): Define to 25.
(factor_using_pollard_rho): Use MR_REPS, not 3.
(print_factors_multi): Likewise.
* THANKS.in: Remove my name, now that it will be automatically
included in the generated THANKS file.
---
 THANKS.in    | 1 -
 src/factor.c | 9 ++++++---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/THANKS.in b/THANKS.in
index 1580151..2c3f83c 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -608,7 +608,6 @@ Tony Leneis t...@plaza.ds.adp.com
 Tony Robinson   a...@eng.cam.ac.uk
 Toomas Soometoomas.so...@elion.ee
 Toralf Förster  toralf.foers...@gmx.de
-Torbjorn Granlund   t...@nada.kth.se
 Torbjorn Lindgren   t...@funcom.no
 Torsten Landschoff  tors...@pclab.ifg.uni-kiel.de
 Travis Gummels  tgumm...@redhat.com
diff --git a/src/factor.c b/src/factor.c
index 1d55805..e63e0e0 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -153,6 +153,9 @@ factor_using_division (mpz_t t, unsigned int limit)
   mpz_clear (r);
 }

+/* The number of Miller-Rabin tests we require.  */
+enum { MR_REPS = 25 };
+
 static void
 factor_using_pollard_rho (mpz_t n, int a_int)
 {
@@ -222,7 +225,7 @@ S4:

   mpz_div (n, n, g);   /* divide by g, before g is overwritten */

-  if (!mpz_probab_prime_p (g, 3))
+  if (!mpz_probab_prime_p (g, MR_REPS))
 {
   do
 {
@@ -242,7 +245,7 @@ S4:
   mpz_mod (x, x, n);
   mpz_mod (x1, x1, n);
   mpz_mod (y, y, n);
-  if (mpz_probab_prime_p (n, 3))
+  if (mpz_probab_prime_p (n, MR_REPS))
 {
   emit_factor (n);
   break;
@@ -411,7 +414,7 @@ print_factors_multi (mpz_t t)
   if (mpz_cmp_ui (t, 1) != 0)
 {
   debug ("[is number prime?] ");
-  if (mpz_probab_prime_p (t, 3))
+  if (mpz_probab_prime_p (t, MR_REPS))
 emit_factor (t);
   else
 factor_using_pollard_rho (t, 1);
--
1.7.12.176.g3fc0e4c



bug#12365: closed (Re: Should cp -n return 0, when DEST exists?)

2012-09-07 Thread Anoop Sharma
On Thu, Sep 6, 2012 at 8:03 PM, GNU bug Tracking System
help-debb...@gnu.org wrote:
 Your bug report

 #12365: Incorrect return value of cp with no-clobber option

 which was filed against the coreutils package, has been closed.

 The explanation is attached below, along with your original report.
 If you require more details, please reply to 12...@debbugs.gnu.org.

 --
 12365: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=12365
 GNU Bug Tracking System
 Contact help-debb...@gnu.org with problems


 -- Forwarded message --
 From: Eric Blake ebl...@redhat.com
 To: Anoop Sharma sendtoan...@gmail.com
 Cc: 12365-d...@debbugs.gnu.org, coreutils coreut...@gnu.org
 Date: Thu, 06 Sep 2012 08:31:53 -0600
 Subject: Re: Should cp -n return 0, when DEST exists?
 tag 12365 wontfix
 thanks

 On 09/06/2012 04:50 AM, Anoop Sharma wrote:
 When -n option of cp is used and the DEST file exists, then, as
 expected, cp is not able to copy the SOURCE to DEST.

 However, cp returns 0 in this case.

 cp -n is not mandated by POSIX, so we are free to do as we wish here.
 But looking at history, we added -n for coreutils 7.1 in Feb 2009, and
 the mail from that thread includes:
 https://lists.gnu.org/archive/html/bug-coreutils/2008-12/msg00159.html

 which states we are modeling after FreeBSD.  A quick check on my FreeBSD
 8.2 VM shows:

 $ echo one > bar
 $ echo two > blah
 $ cp -n blah bar
 $ echo $?
 0
 $ cat bar
 one

 that FreeBSD also returns 0 in this case, and I don't want to break
 interoperability.  Therefore, I'm going to close this as a WONTFIX,
 unless you also get buy-in from the BSD folks.

 By the way, there's no need to post three separate emails with the same
 contents, without first waiting at least 24 hours.  Like most other
 moderated GNU lists, you do not have to be a subscriber to post, and
 even if you are a subscriber, your first post to a given list will be
 held in a moderation queue for as long as it takes for a human to
 approve your email address as a non-spammer for all future posts
 (generally less than a day).

 --
 Eric Blake   ebl...@redhat.com+1-919-301-3266
 Libvirt virtualization library http://libvirt.org



 -- Forwarded message --
 From: Anoop Sharma sendtoan...@gmail.com
 To: bug-coreutils@gnu.org
 Cc:
 Date: Thu, 6 Sep 2012 17:27:52 +0530
 Subject: Incorrect return value of cp with no-clobber option
 When -n (--no-clobber) option of cp is used and the DEST file exists, then, as
 expected, cp is not able to copy the SOURCE to DEST.

 However, cp returns 0 in this case.

 Shouldn't it return 1 to indicate that copy operation could not be completed?

 In absence of this indication how is one to know that some recovery
 action like re-trying cp with some other DEST name is required?

 Regards,
 Anoop





Thank you, Eric.

I am a newbie to open source development tools and processes. I had
posted earlier to bug-coreutils@gnu.org and had got an acknowledgement
mail immediately.

Subsequently, I subscribed to coreut...@gnu.org and have now been
subscribed for more than a month. I originally posted this mail to
that list for discussion. However, there was no acknowledgement from
there and I mistakenly assumed that some spam filter is stopping my
mails from reaching the list. Therefore, I tweaked the text a bit in
an attempt to get past the spam filter and tried multiple times.
Finally, as a work-around, I posted to bug-coreutils@gnu.org and
stopped thereafter, because I got an ack again!

I will be more patient next time!

Thanks for educating,
Anoop





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Torbjorn Granlund
Paul Eggert egg...@cs.ucla.edu writes:

  On 09/06/2012 02:33 PM, Jim Meyering wrote:
* We have some hardwired W_TYPE_SIZE settings for the code interfacing
  to longlong.h.  It is now 64 bits.  It will break on systems where
  uintmax_t is not a 64-bit type.  Please see the beginning of
  factor.c.
   I wonder how many types of systems would be affected.
  
  It's only a matter of time.  GCC already supports 128-bit
  integers on my everyday host (Fedora 17, x86-64, GCC 4.7.1).
  Eventually uintmax_t will grow past 64 bits, if only for the
  crypto guys.
  
It should however be noted that uintmax_t stays at 64 bits even with
GCC's 128-bit integers.  I think the latter are declared as not being
integers, or something along those lines, to avoid the ABI-breaking
change of redefining uintmax_t.

  If the code needs exactly-64-bit unsigned integers, shouldn't
  it be using uint64_t?  That's the standard way of doing
  that sort of thing.  Gnulib can supply the type on pre-C99
  platforms.  Weird but standard-conforming platforms that
  don't have uint64_t will be out of luck, but surely they're out
  of luck anyway.
  
The code does not need any particular size of uintmax_t, except that we
need a preprocessor-time size measurement of it.  The reason for this is
longlong.h's tests of which single-line asm code to include.

The new factor program works without longlong.h, but some parts of it
will become 3-4 times slower.  To disable longlong.h, please compile
with -DUSE_LONGLONG_H=0. (The worst affected parts would probably be the
single-word Lucas code and all double-word factoring.)

I suppose that an autoconf test of the type size will be needed at least
for theoretical portability, if longlong.h is to be retained.
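
[A sketch of what a purely preprocessor-time check could look like, using
only the stdint.h limit macros rather than an autoconf probe; W_TYPE_SIZE
is the macro longlong.h expects, the rest is illustrative:]

  #include <stdint.h>

  /* Pick W_TYPE_SIZE for longlong.h at compile time. */
  #if UINTMAX_MAX == UINT64_MAX
  # define W_TYPE_SIZE 64
  #elif UINTMAX_MAX == UINT32_MAX
  # define W_TYPE_SIZE 32
  #else
  # error "unexpected uintmax_t width; build with -DUSE_LONGLONG_H=0"
  #endif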

There is one other place where some (hypothetical) portability problems
may exist, and that's make-prime-list.c.  It prints a list of uintmax_t
literals.

We let the coreutils maintainers worry about the allowable complexity of
the factor program; I and Niels are happy to sacrifice some speed for
lowering the code complexity.  But first we will increase it by
retrofitting GMP factoring code.  :o)

-- 
Torbjörn





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Jim Meyering
Torbjorn Granlund wrote:
...
* We have some hardwired W_TYPE_SIZE settings for the code interfacing
  to longlong.h.  It is now 64 bits.  It will break on systems where
  uintmax_t is not a 64-bit type.  Please see the beginning of
  factor.c.

   I wonder how many types of systems would be affected.

 It is not used currently anywhere in coreutils?  Perhaps coreutils could

uintmax_t is used throughout coreutils, but nowhere (that comes to mind)
does it fail when UINTMAX_MAX happens to be different than 2^64-1.
What I was wondering is how many systems have a uintmax_t that is
only 32 bits wide.  Now that I reread, I suppose this code would be
ok (albeit slower) with uintmax_t wider than 64.

 use autoconf for checking this?  (If we're really crazy, we could speed
 the factor program by an additional 20% by using blocked input with
 e.g. fread.)

 Please take a look at the generated code for factor_using_division,
 towards the end where 8 imulq should be found (on amd64).  The code uses
 mov, imul, cmp, jbe for testing the divisibility of a prime; the branch
 is taken when the prime divides the number being factored, thus highly
 non-taken.  (I suppose we could do a better job at describing the maths,
 with some references.  This particular trick is from "Division by
 invariant integers using multiplication".)
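
[For reference, a sketch of that divisibility trick, assuming it is the
exact-division test from the Granlund & Montgomery paper just cited:
pinv is p^-1 mod 2^64 and lim is (2^64-1)/p, both precomputed per prime;
the names here are illustrative, not the ones in factor.c:]

  #include <stdint.h>
  #include <stdbool.h>

  /* For odd p, multiplying by p^-1 (mod 2^64) maps the multiples of p
     bijectively onto 0 .. (2^64-1)/p, so one multiply and one compare
     decide divisibility -- the mov/imul/cmp/jbe sequence above. */
  static bool
  divisible (uint64_t x, uint64_t pinv, uint64_t lim)
  {
    return x * pinv <= lim;
  }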

Any place you can add a reference would be most welcome.

Here's one where I'd appreciate a reference in a comment:

  #define MAGIC64 ((uint64_t) 0x0202021202030213ULL)
  #define MAGIC63 ((uint64_t) 0x0402483012450293ULL)
  #define MAGIC65 ((uint64_t) 0x218a019866014613ULL)
  #define MAGIC11 0x23b

  /* Returns the square root if the input is a square, otherwise 0. */
  static uintmax_t
  is_square (uintmax_t x)
  {
    /* Uses the tests suggested by Cohen.  Excludes 99% of the
       non-squares before computing the square root. */
    if (((MAGIC64 >> (x & 63)) & 1)
        && ((MAGIC63 >> (x % 63)) & 1)
        /* Both 0 and 64 are squares mod (65) */
        && ((MAGIC65 >> ((x % 65) & 63)) & 1)
        && ((MAGIC11 >> (x % 11)) & 1))
      {
        uintmax_t r = isqrt (x);
        if (r * r == x)
          return r;
      }
    return 0;
  }
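
[For anyone puzzling over the constants: a minimal generator sketch,
assuming each MAGIC mask simply has bit r set iff r is a square modulo
64, 63, 65, or 11; the "& 63" folds the mod-65 residue 64 onto bit 0,
which is safe since 0 is also a square.  Its output matches the four
constants above.]

  #include <stdio.h>
  #include <stdint.h>

  /* Bitmask of quadratic residues mod m, folded into 64 bits. */
  static uint64_t
  square_mask (unsigned int m)
  {
    uint64_t mask = 0;
    for (unsigned int n = 0; n < m; n++)
      mask |= (uint64_t) 1 << ((n * n % m) & 63);
    return mask;
  }

  int
  main (void)
  {
    printf ("MAGIC64 0x%016llx\n", (unsigned long long) square_mask (64));
    printf ("MAGIC63 0x%016llx\n", (unsigned long long) square_mask (63));
    printf ("MAGIC65 0x%016llx\n", (unsigned long long) square_mask (65));
    printf ("MAGIC11 0x%llx\n",    (unsigned long long) square_mask (11));
    return 0;
  }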





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Niels Möller
Jim Meyering j...@meyering.net writes:

 The existing code can factor arbitrarily large numbers quickly, as long
 as they have no large prime factors.  We should retain that capability.

My understanding is that most gnu/linux distributions build coreutils
without linking to gmp. So lots of users don't get this capability.

If this is an important feature, maybe one should consider bundling
mini-gmp and use that as a fallback in case coreutils is configured
without gmp (see
http://gmplib.org:8000/gmp/file/7677276bdf92/mini-gmp/README). I would
expect it to be a constant factor (maybe 10) times slower than the real
gmp for numbers up to a few hundred bits (for larger numbers, it gets
much slower due to lack of sophisticated algorithms, but we probably
can't factor them in reasonable time anyway).
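
[The fallback could be as small as a conditional include; a sketch,
assuming the bundled copy keeps mini-gmp's usual file name:]

  #if HAVE_GMP
  # include <gmp.h>
  #else
  # include "mini-gmp.h"  /* bundled single-file subset of the GMP API */
  #endif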

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Niels Möller
Torbjorn Granlund t...@gmplib.org writes:

 There is one other place where some (hypothetical) portability problems
 may exist, and that's make-prime-list.c.  It prints a list of uintmax_t
 literals.

I don't think the prime sieving itself is a problem, but for each (odd)
prime p, it also computes p^{-1} mod 2^{bits} and floor ((2^{bits} - 1)
/ p), where bits is the width of uintmax_t.  This will break cross
compilation if uintmax_t has a different size on the build and host
systems, or if different suffixes (U, UL, ULL) are needed in the
generated primes.h.
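
[The p^{-1} mod 2^bits value is the classic Newton/Hensel iteration; a
hypothetical 64-bit sketch, not necessarily the code in
make-prime-list.c:]

  #include <stdint.h>

  /* Inverse of odd p modulo 2^64.  inv = p is already correct to 3 bits
     (p*p == 1 mod 8 for odd p); each step doubles the number of correct
     low bits: 3 -> 6 -> 12 -> 24 -> 48 -> 96. */
  static uint64_t
  binv64 (uint64_t p)
  {
    uint64_t inv = p;
    for (int i = 0; i < 5; i++)
      inv *= 2 - p * inv;
    return inv;
  }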

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Pádraig Brady

On 09/07/2012 09:43 AM, Niels Möller wrote:

Jim Meyering j...@meyering.net writes:


The existing code can factor arbitrarily large numbers quickly, as long
as they have no large prime factors.  We should retain that capability.


My understanding is that most gnu/linux distributions build coreutils
without linking to gmp. So lots of users don't get this capability.

If this is an important feature, maybe one should consider bundling
mini-gmp and use that as a fallback in case coreutils is configured
without gmp (see
http://gmplib.org:8000/gmp/file/7677276bdf92/mini-gmp/README). I would
expect it to be a constant factor (maybe 10) times slower than the real
gmp for numbers up to a few hundred bits (for larger numbers, it gets
much slower due to lack of sophisticated algorithms, but we probably
can't factor them in reasonable time anyway).


Bundling libraries is bad if one needs to update them.
The correct approach here is to file a bug against
your distro to enable gmp, which is a trivial matter
of adding the build and runtime dependency on gmp.

cheers,
Pádraig.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Pádraig Brady

On 09/07/2012 07:19 AM, Jim Meyering wrote:

There have been enough changes (mostly typo fixes) that I'm re-posting
these for review before I push.  I also added this sentence about the
performance hit to NEWS:

 The fix makes factor somewhat slower (~25%) for ranges of consecutive
 numbers, and up to 8 times slower for some worst-case individual numbers.


Thanks for collating all the tweaks.
+1

Pádraig.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Niels Möller
Pádraig Brady p...@draigbrady.com writes:

 On 09/07/2012 09:43 AM, Niels Möller wrote:

 If this is an important feature, maybe one should consider bundling
 mini-gmp

 Bundling libraries is bad if one needs to update them.

mini-gmp is not an ordinary library. It's a single portable C source
file (currently around 4000 lines) implementing a subset of the GMP API,
and with performance only a few times slower than the real thing, for
small bignums. It's *intended* for bundling with applications, either
for unconditional use, or for use as a fallback if the real gmp library
is not available. It's never (I hope!) going to be installed in
/usr/lib. To me, coreutils' factor seems to be a close match for what it's
intended for.

That said, mini-gmp is pretty new (I wrote most of it around last
Christmas) and I'm not aware of any application or library using it yet.
I think the guile hackers are considering using it (for the benefit of
applications which use guile as an extension language, but don't need
high performance bignums).

So if you decide to use it in coreutils, you'll be pioneers.

It *is* used in the GMP build process, for precomputing various internal
tables.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Pádraig Brady

On 09/07/2012 11:35 AM, Niels Möller wrote:

Pádraig Brady p...@draigbrady.com writes:


On 09/07/2012 09:43 AM, Niels Möller wrote:



If this is an important feature, maybe one should consider bundling
mini-gmp



Bundling libraries is bad if one needs to update them.


mini-gmp is not an ordinary library. It's a single portable C source
file (currently around 4000 lines) implementing a subset of the GMP API,
and with performance only a few times slower than the real thing, for
small bignums. It's *intended* for bundling with applications, either
for unconditional use, or for use as a fallback if the real gmp library
is not available. It's never (I hope!) going to be installed in
/usr/lib. To me, coreutils' factor seems to be a close match for what it's
intended for.

That said, mini-gmp is pretty new (I wrote most of it around last
Christmas) and I'm not aware of any application or library using it yet.
I think the guile hackers are considering using it (for the benefit of
applications which use guile as an extension language, but don't need
high performance bignums).

So if you decide to use it in coreutils, you'll be pioneers.

It *is* used in the GMP build process, for precomputing various internal
tables.


I can see the need when bootstrapping,
but I'd prefer if coreutils just relied on regular GMP.

That said, I see there is some push back in debian on depending on GMP.
Note expr from coreutils also uses GMP, which may sway the decision.

thanks,
Pádraig.





bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Jim Meyering wrote:

Linda Walsh wrote:
...

GNU needs to be clear their priorities -- maintaining software
freedom, or bowing down to corporate powers...  POSIX isn't


While POSIX is in general a very good baseline, no one here conforms
blindly.  If POSIX is wrong, we'll lobby to change it, or, when
that fails, maybe relegate the undesirable required behavior to when
POSIXLY_CORRECT is set, or even simply ignore it.  In fact, over the
years, I have deliberately made a few GNU tools contravene some aspects
of POSIX-specified behavior that I felt were counterproductive.




We try to make the tools as useful as possible, sometimes adding features
when we deem them worthwhile.  However, we are very much against changing
the *default* behavior (behavior that has been that way for over 20
years and that is compatible with all other vendor-supplied rm programs)
without a very good reason.



So if I make it enabled with an ENV var, RM_FILES_DEPTH_FIRST, set to
enable the behavior, then you'd have no problem accepting the patch?






bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Eric Blake
On 09/07/2012 08:16 AM, Linda Walsh wrote:

 We try to make the tools as useful as possible, sometimes adding features
 when we deem them worthwhile.  However, we are very much against changing
 the *default* behavior (behavior that has been that way for over 20
 years and that is compatible with all other vendor-supplied rm programs)
 without a very good reason.
 
 
 So if I make it enabled with an ENV var, RM_FILES_DEPTH_FIRST, set to
 enable the behavior, then you'd have no problem accepting the patch?

I personally detest new env-vars that change long-standing behavior,
because you then have to audit EVERY SINGLE SCRIPT to ensure that its
use is unimpacted if the new env-var is set.  It must either be an
existing env-var, or my personal preference of a new --long-option.  But
if you want to submit a patch so that 'rm -r --depth-first .' does what
you want, I'm probably 60-40 in favor of including it.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Eric Blake wrote:

I personally detest new env-vars that change long-standing behavior,
because you then have to audit EVERY SINGLE SCRIPT to ensure that its
use is unimpacted if the new env-var is set.  It must either be an
existing env-var, or my personal preference of a new --long-option.  But
if you want to submit a patch so that 'rm -r --depth-first .' does what
you want, I'm probably 60-40 in favor of including it.

---
I wouldn't be opposed to adding it in addition, but I don't want the extra
typing for what is the more common case for me.  But given that the current
behavior is to return an error -- and there is an expectation of being able
to type in non-working commands just to see the error message -- imagine the
surprise of such users, and how they would curse, if you added an option that
actually made that previously illegal action work.

Most of them who type in random wrong commands just to see error messages
aren't smart enough to use environment variables.






bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Paul Eggert
On 09/07/2012 03:35 AM, Niels Möller wrote:
 It's *intended* for bundling with applications, either
 for unconditional use, or for use as a fallback if the real gmp library
 is not available.

I've been looking for something like that for Emacs, since I want
Emacs to use bignums.  Do you think it'd be suitable?

One hassle I have with combining Emacs and GMP is that
Emacs wants to control how memory is allocated, and wants its
memory allocator to longjmp out if memory gets low, and GMP
is documented to not support that.  If the mini-gmp library
doesn't have this problem I'm thinking that Emacs might use
it *instead* of GMP.
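
[The hook point in question is GMP's mp_set_memory_functions; a minimal
sketch of wiring in custom allocators -- the stand-in functions here just
wrap malloc, where Emacs would substitute its own.  GMP's manual is what
forbids these callbacks from longjmp'ing out instead of returning:]

  #include <gmp.h>
  #include <stdlib.h>

  static void *
  my_alloc (size_t n) { return malloc (n); }

  static void *
  my_realloc (void *p, size_t old, size_t new_size)
  { (void) old; return realloc (p, new_size); }

  static void
  my_free (void *p, size_t n) { (void) n; free (p); }

  void
  init_bignum_alloc (void)
  {
    mp_set_memory_functions (my_alloc, my_realloc, my_free);
  }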






bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Bob Proulx wrote:


Because I originally voted that this felt like a bug I wanted to state
that after determining that this has already been legacy system
historical practice for a very long time that I wouldn't change it
now.  Portability of applications is more important.


Right now, the feature is unused, so hurting compatibility
is not an issue -- users can push for the feature on other systems as needed.

It certainly isn't a safety issue, since
rm **
on a bouncy keyboard is a lot easier to type than rm -fr *,
and the former will remove all files under the current dir
(just none of the directories).

I suppose rm **; rmdir ** would work -- but that would require the SHELL.

I think adding the case of rm -fr . or (dirname/.) to delete the contents
of the dir, but not the dir itself, makes more sense and is safer than
the easier-to-type rm **.




This isn't a feature that could be working in a script for someone.
It isn't something that was recently removed that would cause a script
to break.  A script will run now with the same behavior across
multiple different types of systems.  I think we should leave things
unchanged.


It's only recently that I've noticed rm -fr . not working, and
I can't figure out why, since it hasn't been that way for very long.


Consider the parallel, if I want to make sure I copy the contents
of a dir, I need to use cp dir/. dest/

If I use dir/ or dir, they both end up in dest (even with the /).

That means without using ., the contents are not addressable, so what is
demanded by 'cp' is refused by 'rm'.

That is not a consistent user interface and is symptomatic of poor design.

Using . to reference content of a dir is standard in other utils -- that
it doesn't work in 'rm' goes counter to the idea of how rm works -- you have
to remove contents before trying the current dir.  It isn't logical to think
that it would try the current dir before anything else -- as it goes completely
contrary to how rm has to work.

I say it's a design flaw and inconsistent with other programs.

But if I can set an env var and have it work on my system, someone else
can work to get the core utils to work consistently...







bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Eric Blake wrote:


Then set up a shell alias or a wrapper script that comes first in your
$PATH.  Then it is under your explicit control, while the default is
still appropriate for everyone else.

Just because the defaults don't match your expectations doesn't mean you
can't change the behavior _on your system_ to avoid extra typing on your
part.

---
Doesn't work for programs that need to call rm to remove all files
in a dir.

And I already have changed default behavior on MY systems.


I already added the patch below -- that only does my behavior if the
user isn't running in POSIXLY_CORRECT mode.


Now it's certainly easier to set an env var 1 place to control script
behavior than making changes in all the places...

I'm just trying to get the env var in so I don't have to distribute
a version of rm with my scripts that works.





-

--- src/remove.c2011-10-10 00:56:46.0 -0700
+++ src/remove.c.new2012-09-06 14:28:07.816810683 -0700
@@ -173,6 +173,35 @@
   }
 }

+
+/* separate functions to check for next part of file name being dotdot or...*/
+
+static inline bool
+dotdot (char const *file_name)
+{
+  if (file_name[0] == '.' && file_name[1])
+{
+  char sep = file_name[(file_name[1] == '.') + 1];
+  return (! sep || ISSLASH (sep));
+}
+  else
+return false;
+}
+
+/* dot */
+
+static inline bool
+dot (char const * file_name)
+{
+   if (file_name[0] == '.')
+   {
+   char sep = file_name[1];
+   return (! sep || ISSLASH(sep));
+   }
+   else
+   return false;
+}
+
/* Prompt whether to remove FILENAME (ent->, if required via a combination of
the options specified by X and/or file attributes.  If the file may
be removed, return RM_OK.  If the user declines to remove the file,
@@ -203,6 +232,7 @@

   int dirent_type = is_dir ? DT_DIR : DT_UNKNOWN;
   int write_protected = 0;
+   int special_delete_content = 0;

   /* When nonzero, this indicates that we failed to remove a child entry,
  either because the user declined an interactive prompt, or due to
@@ -222,7 +252,11 @@
   wp_errno = errno;
 }

-  if (write_protected || x->interactive == RMI_ALWAYS)
+   if (!x->posix_correctly && dot(filename) && !x->force)
+   special_delete_content = 1;
+
+
+  if (write_protected || x->interactive == RMI_ALWAYS || special_delete_content)
 {
   if (0 <= write_protected && dirent_type == DT_UNKNOWN)
 {
@@ -281,11 +315,16 @@
   if (dirent_type == DT_DIR
       && mode == PA_DESCEND_INTO_DIR
       && !is_empty)
-fprintf (stderr,
- (write_protected
-  ? _("%s: descend into write-protected directory %s? ")
-  : _("%s: descend into directory %s? ")),
- program_name, quoted_name);
+   {
+   char * action = special_delete_content
+  ? _("delete contents of")
+  : _("descend into");
+   fprintf (stderr,
+ (write_protected
+  ? _("%s: %s write-protected directory %s? ")
+  : _("%s: %s directory %s? ")),
+ program_name, action, quoted_name);
+   }
   else
 {
   if (cache_fstatat (fd_cwd, filename, &sbuf, AT_SYMLINK_NOFOLLOW) != 0)
@@ -476,7 +515,8 @@

   /* If the basename of a command line argument is . or ..,
  diagnose it and do nothing more with that argument.  */
-  if (dot_or_dotdot (last_component (ent->fts_accpath)))
+  if ( (x->posix_correctly ? dot_or_dotdot : dotdot)
+   (last_component (ent->fts_accpath)))

 {
   error (0, 0, _("cannot remove directory: %s"),
  quote (ent-fts_path));
--- src/remove.h2011-07-28 03:38:27.0 -0700
+++ src/remove.h2012-09-06 13:33:01.282362765 -0700
@@ -34,6 +34,14 @@
   /* If true, ignore nonexistent files.  */
   bool ignore_missing_files;

+   /* true if force (-f) was specified, indicating the user knows what they
+* are doing and doesn't want to be questioned or see errors from the command */
+   bool force;
+
+   /* true for users wanting strict posix compliance instead of more flexible,
+* lax, or useful behaviors */
+   bool posix_correctly;
+
   /* If true, query the user about whether to remove each file.  */
   enum rm_interactive interactive;

--- src/rm.c2011-10-02 02:20:54.0 -0700
+++ src/rm.c2012-09-06 13:33:04.132500554 -0700
@@ -206,6 

bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Eric Blake
On 09/07/2012 08:54 AM, Linda Walsh wrote:
 
 Using . to reference content of a dir is standard in other utils -- that
 it doesn't work in 'rm' goes counter to the idea of how rm works -- you
 have
 to remove contents before trying the current dir.  It isn't logical to
 think
 that it would try the current dir before anything else -- as it goes
 completely
 contrary to how rm has to work.

At the syscall level, unlink(".") is required to fail.  To remove a
directory, you must remove its proper name.  You can use
unlink("../child") on systems like Linux that let you remove a directory
that is used as a process' current working directory (on systems like
Windows where this action is forbidden, there's no way to remove the
current working directory).  Therefore, at the shell level, POSIX will
let you do 'rm -r ../child'.  If you think that POSIX should _also_ let
you attempt 'rm -r .', then propose that as a defect report against
POSIX, rather than griping here.
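
[A tiny sketch to see the syscall-level failure firsthand; the exact
errno is system-dependent:]

  #include <stdio.h>
  #include <errno.h>
  #include <string.h>
  #include <unistd.h>

  int
  main (void)
  {
    if (unlink (".") != 0)        /* required to fail */
      printf ("unlink(\".\"): %s\n", strerror (errno));
    if (rmdir (".") != 0)         /* likewise */
      printf ("rmdir(\".\"): %s\n", strerror (errno));
    return 0;
  }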

 
 I say it's a design flaw and inconsistent with other programs.

I would say that it is not a design flaw, but that it is consistent with
the fact that the unlink(.) syscall is required to fail, and that it
is consistent with other Unix implementations.  We can agree to disagree
on that point.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12366: [gnu-prog-discuss] bug#12366: Writing unwritable files

2012-09-07 Thread Paolo Bonzini
On 06/09/2012 19:23, Paul Eggert wrote:
  The file replacement is atomic.  The reading of the file is not.
 Sure, but the point is that from the end user's
 point of view, 'sed -i' is not atomic, and can't
 be expected to be atomic.

Atomic file replacement is what matters for security.

Paolo





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Torbjorn Granlund
Jim Meyering j...@meyering.net writes:

  uintmax_t is used throughout coreutils, but nowhere (that comes to mind)
  does it fail when UINTMAX_MAX happens to be different than 2^64-1.
  What I was wondering is how many systems have a uintmax_t that is
  only 32 bits wide.  Now that I reread, I suppose this code would be
  ok (albeit slower) with uintmax_t wider than 64.
  
The code will work with longlong.h iff W_TYPE_SIZE is defined to the
bit size of uintmax_t.

  Any place you can add a reference would be most welcome.
  
I have added comments here and there.  More comments might be desirable.

  Here's one where I'd appreciate a reference in a comment:
  
#define MAGIC64 ((uint64_t) 0x0202021202030213ULL)
#define MAGIC63 ((uint64_t) 0x0402483012450293ULL)
#define MAGIC65 ((uint64_t) 0x218a019866014613ULL)
#define MAGIC11 0x23b
  
I added a comment explaining these constants.

Here is a new version of the code.  It now has GMP factoring code,
updated from the GMP demos code.



nt-factor-002.tar.lz
Description: Binary data

-- 
Torbjörn


bug#12366: [gnu-prog-discuss] bug#12366: Writing unwritable files

2012-09-07 Thread Paul Eggert
On 09/07/2012 09:38 AM, Paolo Bonzini wrote:

 Atomic file replacement is what matters for security.

Unfortunately, 'sed's use of atomic file replacement does not
suffice for security.

For example, suppose sysadmins (mistakenly) followed the practice of
using 'sed -i' to remove users from /etc/passwd.  And suppose there
are two misbehaving users moe and larry, and two sysadmins bonzini and
eggert.  bonzini discovers that moe's misbehaving, and types:

  sed -i '/^moe:/d' /etc/passwd

and thinks, "Great! moe can't log in any more."  Similarly eggert
discovers that larry's misbehaving, and types:

  sed -i '/^larry:/d' /etc/passwd

and thinks, "All right!  I've done my job too."

Unfortunately, it could be that moe can still log in afterwards.  Or
maybe larry can.  We don't know, because 'sed -i' is not atomic, which
means /etc/passwd might contain moe afterwards, or maybe larry.

Of course one could wrap 'sed -i' inside a larger script, that
arranges for atomicity at the end-user level.  But the same is true
for 'sort -o'.  Perhaps the method of 'sed -i' buys the user
*something*, but whatever that something is, isn't immediately
obvious.  When it comes to security mechanisms, simplicity and clarity
are critical, and unfortunately 'sed -i' has problems in this area,
just as 'sort -o' does.
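
[For concreteness, the idiom under discussion, as a hypothetical C
helper: the rename(2) at the end is atomic, but nothing serializes the
surrounding read-modify-write, which is exactly the moe/larry race
above.]

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  /* Write DATA as the new contents of PATH via a temp file + rename. */
  int
  replace_atomically (const char *path, const char *data)
  {
    char tmp[4096];
    if (snprintf (tmp, sizeof tmp, "%s.XXXXXX", path) >= (int) sizeof tmp)
      return -1;
    int fd = mkstemp (tmp);          /* unique temp name, created O_EXCL */
    if (fd < 0)
      return -1;
    FILE *f = fdopen (fd, "w");
    if (!f || fputs (data, f) == EOF || fclose (f) != 0
        || rename (tmp, path) != 0)  /* the atomic replacement step */
      {
        unlink (tmp);
        return -1;
      }
    return 0;
  }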
 





bug#12366: [gnu-prog-discuss] bug#12366: Writing unwritable files

2012-09-07 Thread Bob Proulx
Paul Eggert wrote:
 Paolo Bonzini wrote:
  Atomic file replacement is what matters for security.
 
 Unfortunately, 'sed's use of atomic file replacement does not
 suffice for security.
 
 For example, suppose sysadmins (mistakenly) followed the practice of
 using 'sed -i' to remove users from /etc/passwd.  And suppose there
 are two misbehaving users moe and larry, and two sysadmins bonzini and
 eggert.  bonzini discovers that moe's misbehaving, and types:
 
   sed -i '/^moe:/d' /etc/passwd

Using /etc/passwd isn't a good example because system convention
dictates that a /etc/passwd.lock must be observed for any edits there,
specifically because of the problem you are illustrating.  The above
would not be correct even if sed were fully atomic overall.

 Of course one could wrap 'sed -i' inside a larger script, that
 arranges for atomicity at the end-user level.

Right.  The 'vipw' script for example.  :-)

[I have abused the EDITOR variable for that purpose many times.  Set it
to either an inline script or to a real script and use it to safely
edit these types of files.  More with 'visudo' though.]

Bob





bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Eric Blake wrote:

On 09/07/2012 08:54 AM, Linda Walsh wrote:

Using . to reference content of a dir is standard in other utils -- that
it doesn't work in 'rm' goes counter to the idea of how rm works -- you
have
to remove contents before trying the current dir.  It isn't logical to
think
that it would try the current dir before anything else -- as it goes
completely
contrary to how rm has to work.


At the syscall level, unlink(".") is required to fail.  To remove a
directory, you must remove its proper name.  You can use
unlink("../child") on systems like Linux that let you remove a directory
that is used as a process' current working directory (on systems like
Windows where this action is forbidden, there's no way to remove the
current working directory).  Therefore, at the shell level, POSIX will
let you do 'rm -r ../child'.  If you think that POSIX should _also_ let
you attempt 'rm -r .', then propose that as a defect report against
POSIX, rather than griping here.


I say it's a design flaw and inconsistent with other programs.


I would say that it is not a design flaw, but that it is consistent with
the fact that the unlink(".") syscall is required to fail, and that it
is consistent with other Unix implementations.  We can agree to disagree
on that point.


Really, I didn't say rm -fr . should *delete* the current directory --
IT SHOULD FAIL -- you are 100% correct.

But it is true that anyone who knows the smallest bit about unix knows
that you have to empty the directory before deleting the directory, and,
thus, rm _MUST_ do a depth first traversal.  If it did and gave an error
at the end: no issue.

It's the special check BEFORE doing the work it would normally do -- failing
BEFORE it does its NORMAL work, the depth-first deletion -- that I am against.

Griping against POSIX is like griping against the government.  But very
few people always go the speed limit, and they would regard a vehicle that
is *unable* to function normally as faulty.

So I am not disagreeing that it should fail.  Please be clear about what I
am asking.   Also, I would *expect* that rm -r . would at least _ask_ you
if you wanted to remove files under a directory that will be unable to be
deleted.

I am only asking for the behavior I describe to work without issuing an error
when I do rm -fr.  I am specifically asking rm to forcefully remove what it
can and CONTINUE to delete what it can in spite of any errors it might
encounter.   Again, the fact that this fails defies normal logic with 'rm'.
I don't believe that rm -fr . has failed for 20 years.  I don't know when it
changed, but it used to be that rm didn't have a special check for . --
BECAUSE -- as you mention, an attempt to unlink . will fail, and using -f
suppresses any error message.










bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Eric Blake
On 09/07/2012 02:56 PM, Linda Walsh wrote:

 Really, I didn't say rm -fr . should *delete* the current directory --
 IT SHOULD FAIL -- you are 100% correct.
 
 But it is true that anyone who knows the smallest bit about unix knows
 that you have to empty the directory before deleting the directory, and,
 thus, rm _MUST_ do a depth first traversal.  If it did and gave an error
 at the end: no issue.

Indeed, reading the original V7 source code from 1979:
http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c

while(--argc > 0) {
        if(!strcmp(*++argv, "..")) {
                fprintf(stderr, "rm: cannot remove `..'\n");
                continue;
        }
        rm(*argv, fflg, rflg, iflg, 0);
}

shows that _only_ ".." was special; "." was attempted in-place and
didn't fail until the unlink(".") after the directory itself had been
emptied.  It wasn't until later versions of the code that "." also became
special.

You therefore may have a valid point that POSIX standardized something
that did not match existing practice at the time, and therefore, it
would be reasonable to propose a POSIX defect that requires early
failure on "..", but changes the behavior on "." and "/" to only permit,
but not require, early failure.  However, I just checked, and the
prohibition for an early exit on "." has been around since at least
POSIX 2001, so you are now coming into the game at least 11 years late.

So, until you take it up with the POSIX folks, I don't think anyone on
the coreutils side cares enough to bother changing the default behavior,
now that it has been standardized, and even though the standardized
behavior is tighter than the original pre-standard behavior.

 Griping against POSIX is like griping against the government.

No, I actually find the Austin Group quite reasonable to work with,
especially if you can provide backup evidence like the V7 source snippet
I just mentioned.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Eric Blake wrote:


You therefore may have a valid point that POSIX standardized something
that did not match existing practice at the time, and therefore, it
would be reasonable to propose a POSIX defect that requires early
failure on .., but changes the behavior on . and / to only permit,
but not require, early failure.  However, I just checked, and the
prohibition for an early exit on . has been around since at least
POSIX 2001, so you are now coming into the game at least 11 years late.


Those changes only started hitting the field a few years ago.

Bash just started working to adopt the 2003 standard with
its 4.0 version -- before that it was 1999 -- I didn't even know
there was a 2001.

Except that, trying to get them to change things now, I'd encounter
the same arguments I get here -- that users expect to be able to have -f
not really mean force -- and to report errors on ..

Not that I believe that -- I just think most users aren't
aware or don't care, but that would be the reasoning.   I get it here;
why would I expect someone whose job is to come up with lame rules that
defy standard practice (last I looked they were proposing to ban space
(as well as 0x01-0x1f) in file names)?   Attempting to deal with people
who want to turn POSIX into a restriction document -- not a standard
reflecting current implementations -- is well beyond my social abilities.

I can't even get engineers -- when faced with clear evidence
of programs that put out inconsistent output -- to fix them.  They know it's
bad output -- and even warn that they are about to do the wrong thing
in warnings.   Somehow this is considered preferable to doing something
useful.

So expecting a group that is heavily into bureaucracy to listen to
reason just doesn't seem like a reasonable expectation.

I did go to their website, though, to see what they were discussing,
and when I saw that sentiment was going in favor of limiting the allowed
characters in filenames, I was too ill to stay.








bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it

2012-09-07 Thread Alan Curry
Eric Blake writes:
 
 Indeed, reading the original V7 source code from 1979:
 http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c
 
[...]
 
 shows that _only_ .. was special, . was attempted in-place and
 didn't fail until the unlink(.) after the directory itself had been
 emptied.  It wasn't until later versions of code that . also became
 special.

I also decided to look around there, and found some of the turning points:

Up to 4.2BSD, the V7 behavior was kept.
(http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/src/bin/rm.c)

rm -rf . was forbidden in 4.3BSD (26 years ago).
http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD/usr/src/bin/rm.c

The removal of dir/. (and dir/..) was not forbidden until Reno.
http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/bin/rm/rm.c
cp = rindex(arg, '/');
if (cp == NULL)
        cp = arg;
else
        ++cp;
if (isdot(cp)) {
        fprintf(stderr, "rm: cannot remove `.' or `..'\n");
        return (0);
}

Maybe the classical behavior stuck around longer in the more SysV-ish Unices.
The Ultrix-11 3.1 tree on TUHS from 1988 has a rm that looks very much like
V7, but I can't find anything to compare it to until OpenSolaris.

Did POSIX force BSD to change their rm in 1988? I think it's more likely that
POSIX simply documents a restriction that BSD had already added. Either way
the latest POSIX revisions certainly can't be blamed.

-- 
Alan Curry





bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Eric Blake
On 09/07/2012 03:30 PM, Linda Walsh wrote:
 Not that I believe that -- I just think most users aren't
 aware or don't care, but that would be the reasoning.   I get it here;
 why would I expect someone whose job is to come up with lame rules that
 defy standard practice (last I looked they were proposing to ban space
 (as well as 0x01-0x1f) in file names)?

You aren't looking very hard, then.  The proposal currently being
considered by the Austin Group is a ban on newline (and newline only)
from file names, because that is the one and only character whose
presence in file names causes ambiguous output for line-oriented tools.
 Forbidding space and most non-printing control characters was rejected
as impractical.  And even with the proposed ban on newline, it is still
just that - a proposal, and not a hard rule, waiting for implementation
practice to see if it is even doable.

http://www.austingroupbugs.net/view.php?id=251

Read the whole thing.  The original poster mentioned a much tighter
bound, but it was shot down, with the _only_ thing being left on the
table under current discussion is _just_ the limitation of newline.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Eric Blake
On 09/07/2012 03:20 PM, Eric Blake wrote:
 Indeed, reading the original V7 source code from 1979:
 http://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/rm.c
 

 shows that _only_ ".." was special; "." was attempted in-place and
 didn't fail until the unlink(".") after the directory itself had been
 emptied.  It wasn't until later versions of the code that "." also became
 special.
 
 You therefore may have a valid point that POSIX standardized something
 that did not match existing practice at the time, and therefore, it
 would be reasonable to propose a POSIX defect that requires early
 failure on "..", but changes the behavior on "." and "/" to only permit,
 but not require, early failure.  However, I just checked, and the
 prohibition for an early exit on "." has been around since at least
 POSIX 2001, so you are now coming into the game at least 11 years late.

In addition to Alan's argument that 4.3BSD forbade '.' before POSIX
began (and therefore the POSIX folks DID standardize existing practice,
even if it wasn't universally common at the time), I find this statement
from POSIX quite informative (line 104265 in POSIX 2008) on why any
proposal to allow 'rm -rf .' to remove non-dot files will probably be
denied:

 The rm utility is forbidden to remove the names dot and dot-dot in order to 
 avoid the consequences of inadvertently doing something like:
 rm -r .*

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Eric Blake wrote:

The rm utility is forbidden to remove the names dot and dot-dot in order to 
avoid the consequences of inadvertently doing something like:
rm -r .*

---
Which is why, IMO, I thought rm -r .* should ask if they really want to remove
all files under . as the first question, as it would show up first in such
a situation.

As stated before, I am more interested in the -f=force it anyway option,
that says to let it fail, and continue, ignoring failure.

I think that may be where the problem has been introduced.

I never used rm - .

Certainly rm ** is easier to mistype than rm -r .* so by that logic, that
should be disallowed as well?

I submit it is the behavior of -f that has changed -- it used to mean
force: continue in spite of errors.  I would always have expected
rm -r . to at least return some error I didn't care about.  What I
wanted was the depth-first removal, and -f to force it to continue despite
errors.

How long has -f NOT meant --force?  Now it only overlooks write
protection errors, which sounds very weak.






bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Eric Blake
On 09/07/2012 09:02 AM, Linda Walsh wrote:

 
 --- src/remove.c2011-10-10 00:56:46.0 -0700
 +++ src/remove.c.new2012-09-06 14:28:07.816810683 -0700

Thanks for making an attempt to show what you want in code.  However,
you provided no ChangeLog entry, no mention in NEWS and no
documentation.  Also, you do not have copyright assignment on file with
the FSF (but if you'd like to pursue this patch further, we can help you
complete the copyright assignment paperwork).  Therefore, this patch
cannot be taken as-is.

 @@ -203,6 +232,7 @@
 
int dirent_type = is_dir ? DT_DIR : DT_UNKNOWN;
int write_protected = 0;
 +   int special_delete_content = 0;

Furthermore, your indentation appears hideous in this email; I'm not
sure you created the patch, and whether this is an artifact of your
mailer corrupting things or whether you really did disregard existing
indentation, but you'd have to clean that up before your patch can be
anywhere worth including.

 +   char * action =
 special_delete_content
 + 
 ? _(delete contents of)
 + 
 : _(descend into);
 +   fprintf (stderr,
 + (write_protected
 + 
 ? _(%s: %s write-protected directory %s? )
 + 
 : _(%s: %s directory %s? )),

This is a translation no-no (not to mention that your hideous
indentation made it hard to read because it was so much longer than 80
columns).  Please don't split English sentences across two separate _()
calls that are then pasted together, but rather write two _() calls of
the two complete sentences.

 +++ src/rm.c2012-09-06 13:33:04.132500554 -0700
 @@ -206,6 +206,7 @@
bool preserve_root = true;
struct rm_options x;
bool prompt_once = false;
 +   x.posix_correctly = (getenv ("POSIXLY_CORRECT") != NULL);

Elsewhere in coreutils, we name such a variable posixly_correct, not
posix_correctly.

And finally, remember my advice - if you want this mode, add it as a new
long option, and NOT as an abuse of POSIXLY_CORRECT, if you want to
avoid controversy and even stand a chance of getting it approved for
inclusion.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it

2012-09-07 Thread Linda Walsh



Alan Curry wrote:

Eric Blake writes:



if (isdot(cp)) {
        fprintf(stderr, "rm: cannot remove `.' or `..'\n");
        return (0);

---
The thing is, by doing rm -rf on ., I am not trying to remove . or ..
I'm trying to remove the files in it.

Otherwise there is no way, using rm, to specify deleting the contents
of a directory but not the directory itself.

I just want to clean out a directory -- I don't want to try to delete the
directory itself.  4.3BSD is breaking the intended functionality of
a depth-first *requirement* for rm to remove a dir.  It's well known
that the contents must be removed before ., so trying to remove . should
only fail after the depth-first traversal -- that would be expected behavior.
I'm NOT trying to remove .   I'm using it to indicate that I only want the
contents of the directory deleted.   It's like using / with a symlink to get
at the dir, except that doesn't work reliably -- the only thing that works
reliably to get at the contents is .

Try cp|mv -r dir1/ dir2/ -- they'll move dir1 under dir2, the same as without
the -r.  The only way for -r to address content is by using . as the source,
in

cp -a src/. dst[/]

Note cp -a src/. dst/. works, but try to mv, and you get the type of error
you'd expect from trying to destroy b/.

Note the copy works but not the mv.  cp should realize it is writing . to .,
which is illegal -- but it allows it because it knows what the user *wants*
to do.

'mv' is stricter -- as you want to move one inode over the top of another.

As a source address, . is allowed to mean content, but as a destination, it
isn't allowed as a writeable destination.

That's the problem with rm: using rm -r '.' specifies the start of a
recursive operation where writes will begin, depth first.  It's not until the
very end that it tries to do the illegal write (delete) operation to ..

If 'cp' allows . as a source (and even accommodates it as a target for cp),
so should 'rm' allow . to mean the start of a deletion, but NOT the dir
itself.

That's the interpretation that cp is using -- as if it were trying to cp the
dir itself over the top of the target, it would give an error message, but it
doesn't.







bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Torbjorn Granlund
I found a problem with the GMP integration.

We have a 100 byte buffer in the stdin reading code, which was adequate
before we used GMP, but now one might want to attempt to factor much
larger numbers.

We'll fix that, but not tonight.
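
[One obvious shape for the fix -- purely a sketch, not the actual patch:
let getline grow the buffer instead of a fixed 100-byte array, then hand
each line to mpz_set_str:]

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>

  int
  main (void)
  {
    char *line = NULL;
    size_t cap = 0;
    ssize_t len;

    /* getline reallocates as needed, so arbitrarily long numbers fit. */
    while ((len = getline (&line, &cap, stdin)) > 0)
      printf ("read a %zd-byte line\n", len);

    free (line);
    return 0;
  }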

-- 
Torbjörn





bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it is documented to do so.

2012-09-07 Thread Linda Walsh



Eric Blake wrote:
...codeing stuff...

Thanks for the advice... I will take it appreciatively; however,
it was a few hours' effort in unfamiliar code.

I certainly wouldn't write NEWS/CHANGES entries if I didn't have an
initial agreement that it would go in.


More to the point: others are objecting (I'm willing to admit
some reasonability in the objection) to changing the default behavior.

I proposed adding an ENV var that would need to be specified to get
the new behavior.   Thus it *would not* be changing default behavior.


Does everyone get that?  That's been offered as an acceptable
compromise.


Vs. the option of adding it as a long option -- that's pushing it
too far, and doesn't work for me, as it's easier for me to maintain
and distribute a patch to rm, or my own version, than it is to have it
as a long option.  Reason: doing it in rm OR as an ENV var does it
in one place and all my code/interactivity benefits.   Doing it as
a long option must be paid for on each use.  The cost doesn't
work for me, nor would it work for anyone considering cost vs. benefit.






bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it

2012-09-07 Thread Eric Blake
On 09/07/2012 06:02 PM, Linda Walsh wrote:

 The thing is, by doing rm -rf on ., I am not trying to remove . or ..
 I'm trying to remove the files in it.
 
 Otherwise there is no way, using rm, to specify deleting the contents
 of a directory but not the directory itself.

Yes there is, and Paul already told it to you:

rm -rf * .[!.] .??*

 
 I just want to clean out a directory -- I don't want to try to delete the
 directory itself.

Then use the triple-glob.  This is portable to both POSIX and to the old
implementations we have been discussing.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it

2012-09-07 Thread Linda Walsh



Eric Blake wrote:

On 09/07/2012 06:02 PM, Linda Walsh wrote:


The thing is, by doing rm -rf on ., I am not trying to remove . or ..
I'm trying to remove the files in it.

Otherwise there is no way, using rm, to specify deleting the contents
of a directory but not the directory itself.


Yes there is, and Paul already told it to you:

rm -rf * .[!.] .??*



You must have missed that rm doesn't expand shell globs... and I don't
want to get the shell involved for rm'ing files any more than cp needs
to in order to copy directories, or the files in a dir and not the dir.





bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it

2012-09-07 Thread Paul Eggert
On 09/07/2012 06:25 PM, Linda Walsh wrote:
 I don't want to get the shell involved

That's not a reasonable constraint.  The shell is
a standard tool for solving this sort of problem,
and involving the shell solves this problem.





bug#12339: Bug: rm -fr . doesn't dir depth first deletion yet it

2012-09-07 Thread Linda Walsh

The shell is one of the things I'm trying not to have a dependency on.  It
doesn't pass a reliability test as it does too much.

I want a utility that removes files -- single files or depth-recursive.
It can fail on the current dir or target dir after it finishes, as it
is documented to do, or it can fail at the beginning, as long as the
-f option said to ignore errors and keep going.

I don't want a wildcard solution.  It's a wildcard. (Doh!)

The issue was changing the default, no?

You don't think I'm being reasonable by agreeing and proposing
a non-default environment var?

Why should cp accept . as addressing the contents of
a directory, but rm be deliberately crippled?  We've
excluded the safety concern, since no one will get to the option
unless they've enabled it; and unless they choose force, they
should get a prompt asking if they want to delete the files in
the protected directory ., right?

So far no one has addressed when the change in '-f' went in
NOT to ignore the non-deletable dir . and continue the recursive delete,
as normal behavior would have it do.  POSIX claims the rationale is to
prevent accidental deletion of the whole current dir when using rm -r .,
but that case could be met by querying; and note that rm ** is just as
dangerous, though not as clean, as it 'only' deletes all files and leaves
the dir skeleton.

So POSIX's rationale would seem to be 1) not that important, and 2)
addressed by sticking with the current behavior or prompting first --
which would make more sense than arbitrarily deciding for users and
removing the ability for rm to remove just files by itself, with no
other utilities invoked.


Why shouldn't rm have the ability to only target everything under a
specified point, rather than always including the point?








bug#12339: Bug: rm -fr . doesn't do dir depth-first deletion yet it

2012-09-07 Thread Alan Curry
Linda Walsh writes:
 
 So far no one has addressed when the change in '-f' went in
 NOT to ignore the non-deletable dir . and continue the recursive delete,

In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno),
the -f option is not consulted before rejecting removal of ., so I don't
think the change you're referring to is a change at all.  -f never had
the effect you think it should have.

-- 
Alan Curry





bug#12339: Bug: rm -fr . doesn't do dir depth-first deletion yet it

2012-09-07 Thread Linda Walsh



Alan Curry wrote:

Linda Walsh writes:

So far no one has addressed when the change in '-f' went in
NOT to ignore the non-deletable dir . and continue the recursive delete,


In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno),
the -f option is not consulted before rejecting removal of ., so I don't
think the change you're referring to is a change at all.  -f never had
the effect you think it should have.


If I were using BSD, I would agree.
---
But most of my usage has been on SysV compatibles -- Solaris, SGI, Linux,
and a short while on SunOS back in the late '80s, but that would have been
before it changed anyway.

For all I know it could have been a vendor add-in, but that's
not the whole point here.

Do you want to support making . illegal for addressing content
in all GNU utils?

If not, then we should look at making a decision that it can
be used to address content and ensure the interface is consistent
going forward.

I think you'll find many more people against the idea, wondering
why it's in 'rm', why -f doesn't really mean ignore all the errors
it can, and why that one should be specially treated.  Of course, they
might also wonder why rm doesn't follow the necessary algorithm for
deleting files -- deleting contents before dying with an error for being
unable to delete a parent.  That might also raise why -f shouldn't be
usable to silence permission or access errors, as it was designed to.
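
For the record, contents-before-parent is ordinary post-order traversal.
A minimal sketch with nftw(3) -- an illustration of the algorithm, not
rm's actual code:

  #define _XOPEN_SOURCE 500
  #include <ftw.h>
  #include <stdio.h>

  /* FTW_DEPTH visits a directory's contents before the directory
     itself, so everything under the starting point gets removed even
     when the starting point (e.g. ".") cannot be. */
  static int
  rm_entry (const char *path, const struct stat *sb,
            int type, struct FTW *ftw)
  {
    if (ftw->level == 0)
      return 0;                 /* leave the starting point itself alone */
    if (remove (path) != 0)
      perror (path);            /* report, but keep going, like -f */
    return 0;
  }

  int
  main (void)
  {
    return nftw (".", rm_entry, 20, FTW_DEPTH | FTW_PHYS) == 0 ? 0 : 1;
  }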

There are plenty of good reasons, aside from historic BSD usage, why it
should be designed in, especially when it's being tucked away as a
non-default behavior that would need environmental triggering to even
be available.








bug#12339: Bug: rm -fr . doesn't do dir depth-first deletion yet it

2012-09-07 Thread Paul Eggert
On 09/07/2012 08:06 PM, Linda Walsh wrote:

 The shell is one of the things I'm trying not
 to have a dependency on.

That sounds unnecessarily impractical.  It's been
decades since I used a system that had 'rm' but
didn't have a shell that could solve this problem easily.

By the way, Alan, that was a nice trip down memory lane!
I've *used* all those 'rm' implementations.  I'm even old
enough to remember when 'rm' automatically refused to
remove itself.

 You don't think I'm being reasonable by agreeing and proposing
 a non-default environment var?

No, because then currently-working scripts might have
to be changed to guard against that variable being set.

Clearly you don't agree with the POSIX rationale.
Reasonable people can disagree, and then move on.
The nice thing about free software is that you can
build your system the way you like.  You don't have
to convert the rest of us.





bug#12339: Bug: rm -fr . doesn't do dir depth-first deletion yet it

2012-09-07 Thread Alan Curry
Linda Walsh writes:
 
 Alan Curry wrote:
  Linda Walsh writes:
  So far no one has addressed when the change in '-f' went in
  NOT to ignore the non-deletable dir . and continue the recursive delete,
  
  In the historic sources I pointed out earlier (4.3BSD and 4.3BSD-Reno),
  the -f option is not consulted before rejecting removal of ., so I don't
  think the change you're referring to is a change at all.  -f never had
  the effect you think it should have.
  
 If I were using BSD, I would agree.
 ---
 But most of my usage has been on SysV compatibles -- Solaris, SGI, Linux,
 and a short while on SunOS back in the late '80s, but that would have been
 before it changed anyway.

SGI is dead, Sun is dead, the game's over, we're the winners, and our rm has
been this way forever.

 
 For all I know it could have been a vendor add-in, but that's
 not the whole point here.
 
 Do you want to support making . illegal for addressing content
 in all GNU utils?

I don't think addressing content is a clearly defined operation, no matter
how many times you repeat it.

Consistency between tools is a good thing, but consistency between OSes is
also good, and we'd be losing that if any change were made to GNU rm's default
behavior. Even OpenSolaris has the restriction: see lines 160-170 of
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/cmd/rm/rm.c
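
The restriction itself is a simple per-operand test.  A hedged sketch of
the idea in C (not the OpenSolaris or coreutils source):

  #include <stdbool.h>
  #include <string.h>

  /* Sketch only: true if the final component of PATH is "." or "..".
     POSIX requires rm to refuse such operands outright, before -f or
     anything else is considered.  (Real implementations also cope
     with trailing slashes.) */
  static bool
  dot_or_dotdot (const char *path)
  {
    const char *base = strrchr (path, '/');
    base = base ? base + 1 : path;
    return strcmp (base, ".") == 0 || strcmp (base, "..") == 0;
  }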

 
 I think you'll find many more people against the idea, wondering
 why it's in 'rm', why -f doesn't really mean ignore all the errors
 it can, and why that one should be specially treated.  Of course, they
 might also wonder why rm doesn't follow the necessary algorithm for
 deleting files -- deleting contents before dying with an error for being
 unable to delete a parent.  That might also raise why -f shouldn't be
 usable to silence permission or access errors, as it was designed to.

Look, I agree it's not logical or elegant. But we have a standard that all
current Unices are obeying, and logic and elegance alone aren't enough to
justify changing that.

A new option that you can put in an alias is really the most realistic goal.

-- 
Alan Curry