bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive

2023-10-23 Thread Niels Möller
Pádraig Brady  writes:

> Will apply the attached later.
> Marking this as done.

Thanks! It would make some sense to me to also have options
--upper/--lower; on encoding, they would specify case of the output, on
decoding, they would reject the other case (with default being to accept
either). But less important than fixing the default behavior.

> +  basenc --base16 -d no supports lower case hexadecimal characters.
> +  Previously an error was given for lower case hex digits.

s/ no / now /

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.





bug#66698: I think hex decoding with basenc -d --base16 should be case-insensitive

2023-10-23 Thread Niels Möller
Hi,

the docs for basenc --base16 says "hex encoding (RFC4648 section 8)".
The referenced section in that RFC says 

  Essentially, Base 16 encoding is the standard case-insensitive hex
  encoding and may be referred to as "base16" or "hex".

I think it would be both more useful, and consistent with docs, if
basenc -d --base16 accepted either upper- or lowercase hex digits.

Current behavior, with basenc (GNU coreutils) 9.1:

  $ echo 666F6F0A |basenc --base16 -d
  foo
  $ echo 666F6f0A |basenc --base16 -d
  fobasenc: invalid input

I think both inputs should give the same output, "foo\n", at least by
default. Possibly configurable with options like --strict, --upper,
--lower, etc (--upper/--lower would be useful also for the --base16
encoding, i.e., no -d).

Regards,
/Niels

-- 
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.






bug#42269: Remove non-GMP code from coreutils factor.c

2020-07-08 Thread Niels Möller
t...@gmplib.org (Torbjörn Granlund) writes:

> If any code is to be removed, then that would be the GMP code of
> coreutils factor.

I agree with Torbjörn. The GMP code in GNU factor might have made more
sense when most computers were 32-bit, and "bignums" were smaller.

But on 64-bit computers, the GMP code would be used only for numbers
above 127 bits. I'm really not that familiar with state of the art
factoring, but I'd guess pollard rho is a bad algorithm choice for that
range, and one ought to use, e.g., some variant of the quadratic sieve.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.





bug#32843: Feature request: rm -ir variant not asking about directories

2018-09-26 Thread Niels Möller
I have a large directory tree where most but not all directories are
empty, and where I might want to keep a few of the existing files.

I can use rm -ir to get rm to ask me for each file if it should be
deleted. But it also asks questions like

  rm: descend into directory 'foo'?
  rm: remove directory 'foo'?

to which I'd always say yes (and then attempts to delete any non-empty
directory fails with a clear warning message).

It would be less tedious if the questions about directories were
suppressed. A reasonable command line flag might be 

  --interactive=non-dir

(If there are any entries which are niether files nor directories, e.g,
a named pipe, I'd want rm to ask, hence "non-dir" rather than "file").

Another variant which would be useful is to traverse a directory tree
and recursively delete all empty directories, without asking any
questions. Would make sense as a --recursive/-r flag to rmdir, rather
than a new option to rm.

I'm using GNU coreutils 8.28, which doesn't seem to have these features.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.






bug#25135: Infinite loop in factor

2016-12-07 Thread Niels Möller
Hi,

I've got an interesting bug report saying that 

  factor 158909489063877810457

is very slow (actually, I don't think it ever terminates).

With below patch to gcd2_odd (the important part is checking if the 
<a1, a0> input is zero; deleting the unneeded handling of even <b1, b0>
makes sense but is not essential), factor terminates instantly,

  $ ./src/factor 158909489063877810457
  158909489063877810457: 3401347 3861211 12099721

Then there's another problem: It may happen that the first Pollard rho
attempt fails and produces only a trivial factorization. In this case,
factor (with the first fix applied) attemps to factor the number 1 and
crashes. The other patch, by Torbjörn, fixes this problem.

Input numbers of interest are 158909489063877810457 (above),
222087527029934481871 (same problem) and 12847291069740315094892340035
(second problem). 

I had a look at extending the test suite, but I gave up after spending
at least half an hour trying to find out how to run individual tests (I
had expected either ./tests/factor/t00.sh or make check
TESTS=tests/factor/t00.sh to do the trick, possible after also setting
RUN_VERY_EXPENSIVE_TESTS, but I couldn't get it to work).

Best regards,
/Niels

commit e4a7c55cc585c07358c00bff42a6ebf65f73136d
Author: Torbjörn Granlund <t...@gmplib.org>
Date:   Wed Dec 7 21:01:03 2016 +0100

factor: Retry properly if Pollard rho gives a trivial factorization

* src/factor.c (factor_using_pollard_rho): Handle trivial factor g = n.
(factor_using_pollard_rho2): Handle trivial factor g1 = n1, g0 = n0.

diff --git a/src/factor.c b/src/factor.c
index 115a635..54893ca 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -1522,6 +1522,13 @@ factor_using_pollard_rho (uintmax_t n, unsigned long int 
a,
 }
   while (g == 1);
 
+  if (n == g)
+{
+  /* Found n itself as factor.  Restart with different params.  */
+  factor_using_pollard_rho (n, a + 1, factors);
+  return;
+}
+
   n = n / g;
 
   if (!prime_p (g))
@@ -1607,7 +1614,7 @@ factor_using_pollard_rho2 (uintmax_t n1, uintmax_t n0, 
unsigned long int a,
 
   if (g1 == 0)
 {
-  /* The found factor is one word. */
+  /* The found factor is one word, and > 1. */
   divexact_21 (n1, n0, n1, n0, g0); /* n = n / g */
 
   if (!prime_p (g0))
@@ -1621,6 +1628,13 @@ factor_using_pollard_rho2 (uintmax_t n1, uintmax_t n0, 
unsigned long int a,
  to trigger.  Please be careful before you change this code!  */
   uintmax_t ginv;
 
+  if (n1 == g1 && n0 == g0)
+{
+  /* Found n itself as factor.  Restart with different params.  */
+  factor_using_pollard_rho2 (n1, n0, a + 1, factors);
+  return;
+}
+
   binv (ginv, g0);  /* Compute n = n / g.  Since the result will */
   n0 = ginv * n0;   /* fit one word, we can compute the quotient */
   n1 = 0;   /* modulo B, ignoring the high divisor word. */

commit 1bdf193895da010daf95260158c1eba5b9377b30
Author: Niels Möller <ni...@lysator.liu.se>
Date:   Wed Dec 7 18:50:20 2016 +0100

factor: Fix infinite loop in gcd2_odd

* src/factor.c (gcd2_odd): Fix the case a1 == 0, a0 == 0.

diff --git a/src/factor.c b/src/factor.c
index d271de9..115a635 100644
--- a/src/factor.c
+++ b/src/factor.c
@@ -480,10 +480,16 @@ gcd_odd (uintmax_t a, uintmax_t b)
 static uintmax_t
 gcd2_odd (uintmax_t *r1, uintmax_t a1, uintmax_t a0, uintmax_t b1, uintmax_t 
b0)
 {
+  assert (b0 & 1);
+
+  if ( (a0 | a1) == 0)
+{
+  *r1 = b1;
+  return b0;
+}
+
   while ((a0 & 1) == 0)
 rsh2 (a1, a0, a1, a0, 1);
-  while ((b0 & 1) == 0)
-rsh2 (b1, b0, b1, b0, 1);
 
   for (;;)
 {

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.






bug#16578: Wish: Support for non-native endianness in od

2014-02-09 Thread Niels Möller
Pádraig Brady p...@draigbrady.com writes:

 Attached in the patch I intend to push in your name.

Nice.

 I also added docs to usage() and the texinfo file, and added a test.

I don't quite understand how the test works, but as far as I see, it
doesn't test floats? So that's inconsistent with the commit message.

 BTW I checked if there was any speed difference with the new code.
 I wasn't expecting this to be a bottleneck, and true enough
 there is only a marginal change. The new code is consistently
 a little _faster_ though on my i3-2310M which is a bit surprising.

Odd. But performance of x86 is usually pretty hard to predict by just
looking at the source or assembly code. I was hoping that in the
non-swapped case, the false conditional

   if (input_swap  sizeof(T)  1)

should be very friendly to the branch predictor, and hence almost free.

Jim Meyering j...@meyering.net writes:

 One nit: please change the type of j here (identical in attached)
 to be unsigned, to match that of the upper bound.

Makes sense. In my own projects, I tend to use unsigned int for loop
counts whereever I don't need to iterate over any negative values. But
my impression is that most others prefer to use signed int for
everything which doesn't rely on mod 2^n arithmetic, so that's why I
made j signed here.

 That would be our first use of rev. Is it ubiquitous enough to depend on?

It appears *not* to be available on my closest solaris box. While on my
gnu/linux system, it's provided by util-linux. For the test, I guess rev
could be implemented something like

while read line
  printf %s line | tr -d '\n' | sed 's/./.\n/' | tac | tr -d '\n'
  echo
done 

Maybe rev should be provided by coreutils, similarly to tac? I'd prefer
not to think about the unicode issues for rev, though...

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#16578: Wish: Support for non-native endianness in od

2014-01-31 Thread Niels Möller
ni...@lysator.liu.se (Niels Möller) writes:

 Pádraig Brady p...@draigbrady.com writes:
 I agree this would be useful and easy enough to add.
 I suppose the interface would be --endian=little|big

 Maybe I can have a look at what it takes.

Below is a crude patch (missing: usage message, tests cases, docs,
translation). I think it should work fine for floats too. I see no
obvious and more beautiful way to do it. 

(And I think I have copyright assignment papers for coreutils in place,
since work on factor some year ago).

Regards,
/Niels

diff --git a/src/od.c b/src/od.c
index 514fe50..a71e302 100644
--- a/src/od.c
+++ b/src/od.c
@@ -259,13 +259,16 @@ static enum size_spec 
integral_type_size[MAX_INTEGRAL_TYPE_SIZE + 1];
 #define MAX_FP_TYPE_SIZE sizeof (long double)
 static enum size_spec fp_type_size[MAX_FP_TYPE_SIZE + 1];
 
+bool input_swap;
+
 static char const short_options[] = A:aBbcDdeFfHhIij:LlN:OoS:st:vw::Xx;
 
 /* For long options that have no equivalent short option, use a
non-character as a pseudo short option, starting with CHAR_MAX + 1.  */
 enum
 {
-  TRADITIONAL_OPTION = CHAR_MAX + 1
+  TRADITIONAL_OPTION = CHAR_MAX + 1,
+  ENDIAN_OPTION,
 };
 
 static struct option const long_options[] =
@@ -278,6 +281,7 @@ static struct option const long_options[] =
   {strings, optional_argument, NULL, 'S'},
   {traditional, no_argument, NULL, TRADITIONAL_OPTION},
   {width, optional_argument, NULL, 'w'},
+  {endian, required_argument, NULL, ENDIAN_OPTION },
 
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
@@ -406,7 +410,21 @@ N (size_t fields, size_t blank, void const *block, 
 \
 {   \
   int next_pad = pad * (i - 1) / fields;\
   int adjusted_width = pad_remaining - next_pad + width;\
-  T x = *p++;   \
+  T x;  \
+  if (input_swap  sizeof(T)  1)  \
+{   \
+  int j;\
+  union {   \
+T x;\
+char b[sizeof(T)];  \
+  } u;  \
+  for (j = 0; j  sizeof(T); j++)   \
+u.b[j] = ((const char *) p)[sizeof(T) - 1 - j]; \
+  x = u.x;  \
+}   \
+  else  \
+x = *p; \
+  p++;  \
   ACTION;   \
   pad_remaining = next_pad; \
 }   \
@@ -1664,6 +1682,24 @@ main (int argc, char **argv)
   traditional = true;
   break;
 
+case ENDIAN_OPTION:
+  if (!strcmp (optarg, big))
+{
+#if !WORDS_BIGENDIAN
+  input_swap = true;
+#endif
+}
+  else if (!strcmp (optarg, little))
+{
+#if WORDS_BIGENDIAN
+input_swap = true;
+#endif
+}
+  else
+error (EXIT_FAILURE, 0,
+   _(bad argument '%s' for --endian option), optarg);
+  break;
+
   /* The next several cases map the traditional format
  specification options to the corresponding modern format
  specs.  GNU od accepts any combination of old- and




-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#16578: Wish: Support for non-native endianness in od

2014-01-30 Thread Niels Möller
Pádraig Brady p...@draigbrady.com writes:

 On 01/28/2014 12:54 PM, Niels Möller wrote:
 For the od program, it would be nice with a flag to specify the
 endianness for all types which are larger than a byte. Possible
 alternatives could be big endian, little endian, native endian.

 I agree this would be useful and easy enough to add.
 I suppose the interface would be --endian=little|big

Maybe I can have a look at what it takes.

 And for floats, besides endianness, it would be nice to be able to
 specify native format or ieee format, for systems where these are
 different.

 That's a bit less useful I think and harder to implement.

I agree that's a bit more obscure. So I understand if you don't want to do
that until there's some concrete usecase.

Endianness for float types should be easier, I hope.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#16578: Wish: Support for non-native endianness in od

2014-01-28 Thread Niels Möller
For the od program, it would be nice with a flag to specify the
endianness for all types which are larger than a byte. Possible
alternatives could be big endian, little endian, native endian.

And for floats, besides endianness, it would be nice to be able to
specify native format or ieee format, for systems where these are
different.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.






bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Niels Möller
Jim Meyering j...@meyering.net writes:

 The existing code can factor arbitrarily large numbers quickly, as long
 as they have no large prime factors.  We should retain that capability.

My understanding is that most gnu/linux distributions build coreutils
without linking to gmp. So lots of users don't get this capability.

If this is an important feature, maybe one should consider bundling
mini-gmp and use that as a fallback in case coreutils is configured
without gmp (see
http://gmplib.org:8000/gmp/file/7677276bdf92/mini-gmp/README). I would
expect it to be a constant factor (maybe 10) times slower than the real
gmp for numbers up to a few hundred bits (for larger numbers, it gets
much slower due to lack of sophisticated algorithms, but we probably
can't factor them in reasonable time anyway).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Niels Möller
Torbjorn Granlund t...@gmplib.org writes:

 There is one other place where some (hypothetical) portability problems
 may exist, and that's make-prime-list.c.  It prints a list of uintmax_t
 literals.

I don't think the prime sieving is not a problem, but for each (odd)
prime p, it also computes p^{-1} mod 2^{bits} and floor ( (2^{bits} - 1)
/ p), where bits is the size of an uintmax_t. This will break cross
compilation, if uintmax_t is of different size on build and host system,
or if different suffixes (U, UL, ULL) are needed in the generated
primes.h.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





bug#12350: Composites identified as primes in factor.c (when HAVE_GMP)

2012-09-07 Thread Niels Möller
Pádraig Brady p...@draigbrady.com writes:

 On 09/07/2012 09:43 AM, Niels Möller wrote:

 If this is an important feature, maybe one should consider bundling
 mini-gmp

 Bundling libraries is bad if one needed to update it.

mini-gmp is not an ordinary library. It's a single portable C source
file (currently around 4000 lines) implementing a subset of the GMP API,
and with performance only a few times slower than the real thing, for
small bignums. It's *intended* for bundling with applications, either
for unconditional use, or for use as a fallback if the real gmp library
is not available. It's never (I hope!) going to be installed in
/usr/lib. To me, coreutil's factor seem to be close match for what it's
intended for.

That said, mini-gmp is pretty new (I wrote most of it around last
Christmas) and I'm not aware of any application or library using it yet.
I think the guile hackers are considering using it (for the benefit of
applications which use guile as an extension language, but don't need
high performance bignums).

So if you decide to use it in coreutils, you'll be pioneers.

It *is* used in the GMP build process, for precomputing various internal
tables.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.





Documentation discrepancy for ls -f

2006-03-03 Thread Niels Möller
I this still the right place to send bug-reports for GNU ls? Or should
I use bug-coreutils instead?

I'm using ls (GNU fileutils) 4.0. The info documentation has the
following to say about -f:

`-f'
 Primarily, like `-U'--do not sort; list the files in whatever
 order they are stored in the directory.  But also enable `-a'
 (list all files) and disable `-l', `--color', and `-s' (if they were
 specified before the `-f').

while the message from ls --help says

  -f do not sort, enable -aU, disable -lst

This differs with respect to the -t and --color options. That -f
disables --color seems to match the implementation.

I think it would also be helpful to say something about what -f is
intended to be used for and why the option character 'f' (force) is
selected for that. My best guess is that it is intended to make the
operation as efficient as possible, disabling time consuming operation
such as sorting the list and stat()ing all the entries, but I really
don't know the background.

Regards,
/Niels


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils