from:"Niels Möller"

Re: mpz_t caching

2015-09-04 Thread Niels Möller

Vincent Lefevre <vinc...@vinc17.net> writes:

> In 2014, Patrick Pelissier (in Cc) implemented a mpz_t allocation
> cache for MPFR, redefining mpz_init and mpz_clear, in order to
> avoid some deallocations/allocations (via the indirect call to
> the allocation functions) when mpz_t's cleared and init'ed again
> a bit after. I've attached the patch that was applied to MPFR.

I take it this was done to improve performance? Do you have any
information on what applications benefitted, and some numbers for
typical performance improvement?

If I understand the patch correctly, the idea is essentially to keep
your own free-list for the limb storage pointed to by _mp_d? If you beat
libc's malloc, and by how much, is going to be quite system dependent.

> Note: A drawback is that it may be necessary to free the caches at
> the end of the program so that tools like valgrind don't complain.

And there are also thread-safety issues; currently GMP leaves all such
problems to the provided allocation function, while a cache would need
some thread-local storage (or locks, but I guess that's highly
undesirable if the idea is to gain performance).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Lazy mpz allocation

2015-09-28 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

>> I'd like a sort of "copy-on-write" interface, that allocates the needed
>
> I looked into the code and realised that my proposal requires an
> MPZ_REALLOC or if(ALLOC(x)==0) branch in ANY place where we write into an
> mpz_t. E.g. mpz_combit starts with:
>
>   if (limb_index + 1 < SIZ(x)) {
> PTR(x)[limb_index] ^= bit;
> return;
>   }

Good point. I'd say we postpone the "copy-on-write" thing for now. And
only support the following cases:

1. alloc > 0: Normal case, |size| <= alloc, reallocated as needed.

2. alloc == 0, size == 0: Initial lazy state, produced by mpz_init.
   Allocated on first write.

3. alloc == 0, size != 0: Allowed for input (read-only) mpz:s. Produced
   by mpz_roinit. Any attempt at writing the variable is invalid, and
   should call abort in all cases where checking for this case is cheap
   enough.

One corner case is what should be produced by mpz_roinit with a zero
input. After normalization we get size == 0. The intended meaning is
that of case (3), but if it is misused as an output variable, it will be
treated as case (2), and the allocated storage will leak. But I think it
is acceptable to have some random bad things happen on misuse.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Pull request: Raise SIGFPE instead of abort in __gmp_exception`

2016-01-07 Thread Niels Möller

Yichao Yu <yyc1...@gmail.com> writes:

> We (julia) catches the SIGFPE generated by GMP and raise our own
> error.

Out of curiosity, what are you doing on error? Display a message and
exit the process, or do you try to continue? At the very least, you
could suffer memory leaks, and perhaps leave other gmp objects in an
inconsistent state.

Maybe it would be simple and helpful to add a global error callback
pointer (where you could install a function looking up some thread-local
storage, if you want something thread-specific). But robust cleanup from
errors is quite tricky (it has been discussed on list from time to
time, one possibility is to combine it with custom memory allocators
with markers, and let the error handler longjmp out and clean up and
deallocate transient storage).

That said, it sounds reasonable to me to replace abort by raise(SIGFPE).
A configure check may be needed; I think both abort and raise are ANSI
C, but I suspect SIGFPE isn't.

Depending on pthreads functions is undesirable.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Pull request: Raise SIGFPE instead of abort in __gmp_exception`

2016-01-11 Thread Niels Möller

Yichao Yu <yyc1...@gmail.com> writes:

> Hmm. What about `#ifdef` on `SIGFPE` and `abort` otherwise. This
> should be a sane enough default for the even when the custom handler
> is added.

Sounds good enough. If noone objects, I'll make a patch to do that and
try the patch approval procedure at work.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

mpq_cmp

2016-05-26 Thread Niels Möller

t how and terminate.

  If (a == c || b == d) 
// A large gcd found
Compare the un-equal pair of elements and terminate.

// Compute full products. If practical, high limb first with early
// termination.
    return sgn (a d - b c)

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Register gmp with pkg-config

2016-06-15 Thread Niels Möller

Irina Gossmann <irina.gossm...@gmail.com> writes:

> Is there any reason for not including a .pc file with gmp? Since it's not
> registered with pkg-config, code depending on it won't build without some
> addition fiddling (I'm on Linux and OSX, not sure if/how it works
> differently on other platforms).

I guess the .pc file has to be generated by configure, to get file names
right. In a different project, I did it like this:
https://git.lysator.liu.se/nettle/nettle/commit/6e371fbfa7107fcd2f89e45693d099e6cd98a0a1

(There might be some helpers with libtool or automake; I don't know
those tools too well).

pkg-config support should also be documentated, somewhere close to the
section https://gmplib.org/manual/Autoconf.html#Autoconf (generated from
doc/gmp.texi).

> If you have a way of accepting contributions from outside, I can contribute
> a pkg-config file. Couldn't find a howto for contributors on the website,
> would've sent a patch directly otherwise.

Please send patches to this list. For non-trivial changes, we ask for a
Free Software Foundation copyright assignment.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Pull request: Raise SIGFPE instead of abort in __gmp_exception`

2016-01-28 Thread Niels Möller

Yichao Yu <yyc1...@gmail.com> writes:

> If I read the patch[1] correctly, the divide by zero is nor performed
> if SIGFPE exists.

You're right, in that case it skips the divide by zero and goes directly
to abort.

>> It's possible to add som OSX workaround, say
>>
>>   #if defined (SIGFPE) && defined (MAC_SOMETHING)
>>
>> But I'd prefer to do that only if there's any real problem. After all,

> There is.

Can you suggest and test a workaround of that form?

> I think the right way is to conditionally (on osx) use `pthread_kill`
> (and link to pthread).

A link dependency on pthread is highly undesirable. *Maybe* if it can be
hidden from users.

> I think another difference that worth clarifying between what we want
> and what is currently assumed in gmp is that we don't really want the
> program to terminate if there's some arithmetic error.

Using a signal handler for that sounds brittle to me. I would strongly
recommend checking for invalid operands (like, if (mpz_sgn(d) == 0)
julia_divide_by_zero_error()) before calling gmp functions which might
divide by zero. Allocation failures are a differnt a more difficult
matter, but there it's probably better to longjmp out of the allocation
function than to raise a signal.

> Of course, using a callback to report the error would be even better.
> However, as you (or someone else) mentioned, having such API would
> probably imply that GMP should still be in a consistent state after an
> error is thrown (and handled)

But you have very similar consistency issues when using a signal
handler. I'm afraid it's going to be a bit brittle. Or are you using
some thread local state in the allocation functions, so you can
deallocate temporary storage when terminating the thread?

> Adding such API and refine it later would still be strictly
> better than the current situation though.

To me it would make sense with something like

  mp_set_error_handler (void (*handler)(enum gmp_error err));

where the current __gmp_exception is the default handler. Still tricky
to safely use a handler which doesn't terminate the program, but at
least cleaner and more portable than using signal handlers. In
particular, mixing signals and longjmp really makes me uneasy...

Not sure what the other developers think?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Pull request: Raise SIGFPE instead of abort in __gmp_exception`

2016-01-28 Thread Niels Möller

Yichao Yu <yyc1...@gmail.com> writes:

> However, if I understand correctly, this may cause a regression on OSX
> x64 for multi-threaded programs since the signal may be delivered to
> any thread instead of the current thread.

I don't fully understand the consequences, but if raise ever returns for
any reason, __gmp_exception will go on and divide by zero, and if that
doesn't cause program termination, it will next call abort.

Posix seems pretty clear that raise must deliver the signal to the
thread executing raise.
http://pubs.opengroup.org/onlinepubs/009695399/functions/raise.html. And
the C standard says at least that "If a signal handler is called, the
raise function shall not return until after the signal handler does."
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1570.pdf 

It's possible to add som OSX workaround, say 

  #if defined (SIGFPE) && defined (MAC_SOMETHING)

But I'd prefer to do that only if there's any real problem. After all,
the typical action is that the program is terminated, and if a signal
handler is installed, it typically just adds a friendlier error message
before termination.

> Unconditionally trying the old approach first should solve this issue.

You mean, first try divide by zero, and in case that doesn't crash the
program, raise a signal? If we want the specific signal SIGFPE, then I
think using raise is preferable, because it has a reasonably well
defined meaning in C. While division by zero is undefined behaviour.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Pull request: Raise SIGFPE instead of abort in __gmp_exception`

2016-01-28 Thread Niels Möller

Yichao Yu <yyc1...@gmail.com> writes:

>> Patch checked in now.
>
> I guess I'm not familiar with hg/gmp work flow. Is checked in
> different from commited to the repo? I don't see the change appearing
> in the repo[1] and my local fresh clone doesn't have it either.
>
> [1] https://gmplib.org/repo/gmp/

Maybe there's something wrong with the mirroring to the public repo?

It's missing my checkin and a previous one of Torbjörn's, compared to
what I see with ssh access to shell.gmplib.org/var/hg/gmp.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: getrusage vs clock

2016-01-21 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I believe we already use clock_gettime(2) directly on GNU/Linux, which
> is the underlying syscall that clock(3) uses in newer GNU libc versions.

Last time I looked, one had to run speed with -o cycles-broken to force
using clock_gettime.

Maybe it would make sense to use cycle counter and clock_gettime
together? But I have no idea how the "supplemented by" logic works.

The problem with the cycle counter on linux is that it isn't saved and
restored on context switch. clock_gettime is the supported interface for
getting high precision per-process cpu time. 

And it's news to me that the old clock function uses the same timers in
recent glibc. But it sounds nice. What's the value of CLOCKS_PER_SEC?
Does it match the actual resolution?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_redc_n

2016-03-14 Thread Niels Möller

paul zimmermann <paul.zimmerm...@inria.fr> writes:

> at the request of Niels I re-raise the issue at
> https://gmplib.org/list-archives/gmp-devel/2015-March/003945.html,
> i.e., the fact that redc_1 and redc_2 do not have the same convention
> than redc_n.

When is a good time to try to merge my old bdiv changes? I don't even
recall the details, but IIRC, the log message is "New bdiv convention, R
B^{qn} = U + Q D", and I think the bdiv return value is also the carry
out from that addition.

In ~nisse/hack/gmp-bdiv on shell.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Move -DNO_ASM to config.h?

2016-03-20 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> when configure wants to define a macro, it usually puts it in
> config.h. We do have one exception: NO_ASM, which ends up defined in
> CFLAGS. Was there a particular reason for this choice?

I'm greping for its uses, and it is used only in C files as far as I
can see. 

Moving it to config.h makes sense to me. If we move it, we have make
sure that config.h is always included before longlong.h. Most files get
config.h via gmp-impl.h. So, e.g., #include "longlong.h" followed by
#include "gmp-impl.h" would break. 

I think the autoconf manual recommends an explicit include of config.h
(guarded by HAVE_CONFIG_H) first in every C file. We don't have to do it
that way of course, just be aware that we depart from that convention.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Lazy mpz allocation

2016-05-04 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I forgot about that we need to have an explicit denominator.
> We could surely point that to some static read-only data, but that would
> of course incur a cost when "re-allocating".

I'm not familiar with mpq internals, but I guess it might be possible to
get away with leaving the denominator as zero on initialization.

Then one would need to follow the convention that if numerator is zero,
then the denominator is ignored. Or (but I guess that's more cumbersome)
adapt the convention that zero denominator means "this is an integer",
and treated as one.

/nisse

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_invert_limb

2016-07-15 Thread Niels Möller

paul zimmermann <paul.zimmerm...@inria.fr> writes:

> would it be possible to add mpn_invert_limb in the public interface?
> It would be very useful for GNU MPFR.

Makes sense to me, it's a useful and well-defined operations, and
discussed at length in various papers. So should be quite easy to
document.

As far as I understand, it's not always compiled into the library,
though. Internally the code uses the macro invert_limb, which expands to
a call to mpn_invert_limb only if there exists a native implementation,
otherwise, the macro expands to a call to udiv_qrnnd.

Not sure if there's any problem with always making mpn_invert_limb a
real function and call it explicitly. In the cases where it expands to
udiv_qrnnd, it's probably so slow that an extra level of function call
shouldn't matter much.

Looking at longlong.h, there appears to be some cases where udiv_qrnnd
is in turn defined as mpn_invert_limb + udiv_qrnnd_preinv (not entirely
clear when that might happen, seems possible only for alpha, arm, cray,
and ia64). And there's some room for cleanup in longlong.h, for example
a few #if 0, and some constants (UMUL_TIME and UDIV_TIME) which are no
longer used, as far as I can see, and most recently mentioned in
ChangeLog back in 2002.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Add mpz_inp_str to mini-gmp

2016-07-08 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I haven't looked at mini-gmp's mpz_set_str, but I assume it is O(n^2),
> so not much is lost by using a trivial implementation like the one
> above.

It tries to be O(n) when base is a power of two. Which I think is a
reasonable tradeoff. I was a bit surprised and disappointed when I found
out (a few years ago, no idea if it's still the case) that the base-256
conversion in libtomcrypt, used by the dropbear ssh implementation which
is popular on embedded systems, was O(n^2). And one use-case for
mini-gmp is constrained embedded systems.

> (I don't think we should accomodate spaces in the numbers, since that's
> considered a mis-feature of the main GMP.)

Then the documentation should perhaps be more explicit and discourage
use of that feature.

Speaking of documentation, the docs for mpz_inp_str isn't clear at all
on what input characters are considered to terminate the input.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Add mpz_inp_str to mini-gmp

2016-07-08 Thread Niels Möller

Austyn Krutsinger <akrutsin...@gmail.com> writes:

> I think the approach is simple enough, but the possible
> downside is that the entire string has to be read into memory before being
> passed to mpz_set_str,

That's no big deal for mini-gmp, it's not intended to be used for huge
numbers, or make a lot of effort to be as efficient. Which of course is
a judgement call.

Austyn Krutsinger <akrutsin...@gmail.com> writes:

> According to current documentation mpz_set_str
> <https://gmplib.org/manual/Assigning-Integers.html#index-mpz_005fset_005fstr>
> allows spaces within the string and mpz_inp_str
> <https://gmplib.org/manual/I_002fO-of-Integers.html#I_002fO-of-Integers>
> only allows preceding white space. Torbjörn, had mentioned whitespace in an
> inputted string is a mis-feature in the main GMP. Just curious what the
> intended function of GMP actually is with respect to strings.

I'm not sure how mpz_inp_str is intended to behave. If it's acceptable
to read everything until white-space or EOF is encountered, that makes
things a lot simpler. 

After a quick look at the code (in mpz), my understanding was that it
should terminate at the first non-digit. I.e., for a file containing
0123456789, calling mpz_inp_str with base 10 would read the entire file,
but calling mpz_inp_str with base 0 (or 8) would read until it sees the
'8', ungetc that character, and convert only the digits preceding '8'.

> Additionally, the 6.1.1 documentation says that the mpz_str_ functions
> should allow for bases from 2 to 62. Currently mini-gmp only allows for
> bases from 2 to 36. If there is any intent on allowing mini-gmp to support
> strings with base from 2 to 62, I will write up a patch and submit for
> review.

Extending support up to base == 62 sounds reasonable to me.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Add mpz_inp_str to mini-gmp

2016-07-09 Thread Niels Möller

Austyn Krutsinger <akrutsin...@gmail.com> writes:

> As I have it now, the only code that is duplicated between my mpz_inp_str
> and mpz_set_str is the removal of any leading white spaces between in the
> string.

I think that duplication is fine. One could add a helper function to eat
whitespace, but I don't think that really makes things simpler.

I'd like to hear Torbjörn's opinion on using white space as terminator,
i.e., reading a file containing "12345678 " with base 8 will result in
an error, which I suspect differs from the behavior of mpz_inp_str in
the real gmmp.

Some more detailed comments further below.

> I have updated mini-gmp to support up to base 62 also.

Please do this as a separate patch.

> I'm really not sure of the optimal way to
> allocate then realloc memory for the input buffer. I suppose some kind of
> statistical analysis of the size of strings read in from file streams would
> be ideal, but since I don't have such statistics, I have arbitrarily chosen
> the sizes you see.

I think typical cases will be quite small. I'd suggest an initial size
of 100 characters, and then increasing the size by a factor of 3/2
at a time. Possibly with some arbitrary upper limit to fail in a
predictable way for a malicious input file.

> int
> mpz_inp_str (mpz_t x, FILE *stream, int base)
> {
>   int c;
>   size_t nread = 0;
>   size_t size = 1000;
>
>   if (stream == 0)

Either "!stream",  or "stream == NULL".

> stream = stdin;
>
>   char *buf = (char *) gmp_xalloc (size);

The (char *) cast is unnecessary (unless we try to support compilation
as C++, do we?).

>   /* Skip leading whitespace */
>   do
> {
>   c = getc (stream);
> } while (isspace (c));
>
>   /* Read input until end of file or a space character */
>   while ((c != EOF) && (isspace (c) == 0))

I think  "!isspace (c)" is clearer.

In in the case c == EOF, I think one ought to also have an

  if (ferror (stream)) 
{
  gmp_free (buf);
  return 0;
}

Not sure how the real gmp handles i/o errors in mpz_inp_str, but
returning success seems dangerous.

> {
>   if (nread >= size - 1)

This looks prone to off-by-one errors (even if I believe it is correct).
Maybe let size be the maximum number of digits, with an actual
allocation of size+1 to accomodate the terminating NUL character?

> {
>   /* Increase input buffer size */
>   size_t old_size = size;
>   size += 100;

I think a multiplicative increase is appropriate. E.g., "size = 3*size/2;".

>   buf = (char *) gmp_xrealloc (buf, old_size, size);

You don't need to bother with passing old_size, mini-gmp is documented
to always pass 0 for this argument (see mini-gmp/README). BTW, the
restriction to base <= 36 is also documented there.

> }
>
> buf[nread++] = c;
>
>   c = getc (stream);
> }
>
>   buf[nread] = '\0';  /* Ensure null terminated string */

"NUL terminated".

>   ungetc (c, stream);

Not sure if ungetc is safe when c == EOF?

>   c = mpz_set_str(x, buf, base);

Don't reuse the c variable, use a separate int res.

>   gmp_free (buf);
>   /* 0 on error else number of characters read excluding null-terminator */

Drop " excluding...", or change "null-" to "NUL ".

>   return (c == -1 ? 0 : nread);
> }

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Add mpz_inp_str to mini-gmp

2016-07-06 Thread Niels Möller

Austyn Krutsinger <akrutsin...@gmail.com> writes:

> I am not sure if there was a reason FILE streams were not implemented
> before, but if there is I'd be interested in hearing why.

mpz_out_str is included in mini-gmp, so it would make sense to also
include the companion function mpz_inp_str.

However, for mini-gmp, we need something considerably simpler than the
code in mpz/inp_str.c. Which isn't entirely trivial. Ideally, it should
just read the number and pass it on to mpz_set_str. But to detect the
end of the number, it first has to parse any base prefix (like "0x"),
and then look for the first non-digit. And it's no good to have a lot of
code duplication with mpz_set_str.

Without thinking very long about it, maybe one could get something
reasonable by extracting the digit logic of mpz_set_str to a helper
function. We could also consider omitting support for base == 0 for the
mini-gmp version of mpz_inp_str, if that makes things simpler.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: increase mini-gmp base support to 62

2017-02-03 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> Applied. But I changed again the code that checks the base. I used the
> following:
>
>  digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
>  if (base > 1)
>{
>  if (base <= 36)
>digits = "0123456789abcdefghijklmnopqrstuvwxyz";
>  else if (base > 62)
>return NULL;
>}
>  else if (base >= -1)
>base = 10;

It's documented that base = 0 means base 10. But should |base| = 1 also
imply base 10? Would make more sense to me to treat that as an error.

>  else
>    {
>  base = -base;
>  if (base > 36)
>return NULL;
>}

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_sqrtrem{1,2}

2017-02-01 Thread Niels Möller

Adrien Prost-Boucle <adrien.prost-bou...@laposte.net> writes:

> Maybe the availability of SSE / AVX / NEON etc instruction sets can be
> checked at compilation time?

That's what configure (and its helper scripts) does.

With --enable-fat, we also have runtime detection on certain systems.

> The ASM version would be very easy to obtain:
> compile sqrtrem1 and sqrtrem2 (an FP implementation) on the right
> machine and keep the ASM.

Or if these are small and easy functions, write them by hand.

> There would be no dependency on libm.
> How difficult would it be to add such checks in GMP code?

It's not that hard to add new assembly files. You need to know some
assembly, of course. And be aware that gmp uses m4 to preprocess
assembly source files.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: increase mini-gmp base support to 62

2017-02-01 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> But the proposed patch adds constants that enlarge (too much in my
> opinion) both the source and the binary of this small-size implementation.

I also prefer the conciser version without the large table.

> I attach an alternative patch. Please revise it.

Looks good. Only detail I don't quite like is the variable name
"lowercaseoffset". Maybe "value_of_a" (since it's the value of the
digit 'a')? Or at least add an underscore or two, lowercase_offset.

Thanks,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Problem with gmp_randinit_set

2017-02-19 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> The problem is that Niels' code, mine, and the current mpz code... do all
> "reduce" modulo (2^19937-20023) obtaining some non-canonical
> representation.

I was thinking that one should convert to canonical representation at
the end of the powering. But if the current code doesn't do that,
staying compatible would need extra care. And it rules out the simple
way of just using mpn_powm.

(And then I realized that one has to produce a canonical representation
also for the initial reduction, since that uses a different modulo, p-4).

> If we "do not want to generate different sequences than earlier GMP", we
> will have to mimic current behaviour, even in the corner cases...

I think that should be doable, if we want to. Without thinking too
deeply about it, it looks like your variant with shift should be
equivalent to the current mpz code, maybe mine too (it essentially
combines the shift with the addmul_1 call).

> Is it worth doing?

I think it's desirable to eliminate the dependency on mpz. How important
it is not change the seed --> sequence mapping, I don't know.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Recent changes to mpn_get_str/mpn_set_str

2017-02-15 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I am not sure sliding-window powering makes much sense for powering in Z

It could reduce the mumber of mpn_mul_1 calls, but (as always) leave the
number of squarings unchanged.

> (as opposed to in a ring).

But Z *is* a ring ;-)

> For radix conversion, we arrange things so that the bases are never
> small, so the plain mpn_pow_1 should be very hard to beat.

Ah, then sliding window makes less sense, since there will be no
interesting single-limb powers to tabulate. And we probably don't want
so large table entries that mpn_mul wins over repated mul_1.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz reuse test takes too much time

2017-01-18 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

>> Now we have two possible strategies:
>>
>> 1. Call mpn_gcdext (B, A), which produces T. And compute S as
>>
>>  S = (G - T B) / A
>
> This means computing both cofactors, right?

Yes.

>> 2. First do a division up front,
>>
>>  B = Q A + R
>>
>>and call mpn_gcdext(A, R). We then get a cofactor S', of m limbs. But
>>we also need the other cofactor T', because the wanted cofactor S is
>
> This also means computing both cofactors, don't it?

I think you're right. We compute S' and T' (gcdext after initial
division), and then construct S. So I didn't say that we compute T. But
in fact, T' = T, since T = (B/G)^-1 (mod A/G), T' = (R/G)^-1 (mod A/G),
and B = R (mod A). Right?

> I assume that the best way to compute both cofactors is the one currently
> implemented in mpz_gcdext.

It's best under the theory think that a few large multiplications at the
end is more efficient than building up the other cofactor incrementally.
But might not be true in all cases...

But we should probably reorganize mpn_gcdexp first, before evaluating
other strategies regarding the larger cofactor. The divide-and-conquer
thing should be done slightly different, and that might make it
cheaper to get the other cofactor at least for large operands.

>> And a third strategy could be to extend mpn_gcdext to support A < B, and
>> hence compute the larger cofactor directly. But I'd guess that would be
>> more work, since gcdext uses a quadratic algorithm for much larger sizes
>> than for multiplication and division.
>
> Actually mpn_gcdext supports A<B, but it requires to zero-pad A so that
> its length in limbs is not smaller than the length of B.

Maybe the way mpn_gcdext handles this can be more or less unchanged. But
we shouldn't have to zeropad the input.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: GMP work on symbol visibility

2016-09-03 Thread Niels Möller

ni...@lysator.liu.se (Niels Möller) writes:

> Is this 6 year old post still valid? 

Hmm, here's a related gcc bug with is marked as fixed,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51880, also marked as a dup
of the old bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19520 Ian
refers to.

> I guess it would also be useful with some function __attribute__
> expressing that "I don't give a damn whether or not pointer comparison
> for address(es) to this function work according to the C standard".

Anyone else who thinks that would be a good idea? I'm considering filing
a bug report/feature request to gcc.

Anyway, as to using protected, it seems it's a bit too hairy for its
own good. 

But why doesn't symbol aliases with hidden visibility have the same
issues?

Say we have an internal symbol mpn_foo_internal with visibility
"hidden", and an alias mpn_foo with visibility "default". Then I think
the desirable behaviour is that pointers to mpn_foo and mpn_foo_internal
are equal if and only if the executable doesn't provide a different
implementation of mpn_foo. But I have no idea how that really works. And
since symbol aliases aren't C standard, one could argue that the
standard's rules for pointer equality don't apply.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: GMP work on symbol visibility

2016-09-01 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> There are several complex issues I need to understand and resolve.  One
> issue is how to handle (unit) testing of hidden functions.

Do we have a lot of functions which (1) you'd want to hide, and (2)
there are unit tests for?

I see two simple solutions: Either do those unittests only against the
static library. Or use visibility "protected".

As I understand it, "protected" does about what we want: Makes sure
internal references don't go via PLT / and GOT. There are some
drawbacks: It doesn't shrink the symbol table, which would be an issue
if we have lots of "protected" symbols, and IIRC, I think Ian said that
glibc's ld.so was poorly optimized for this case, so protected symbols
could make library load slower than using "default" visibility.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: [PATCH] Add pkg-config support

2016-08-30 Thread Niels Möller

Michael Catanzaro <mcatanz...@gnome.org> writes:

> Right... using AC_CHECK_LIB has been considered primitive for a long
> time now. I was really surprised to see GMP doesn't already have pkg-
> config file. I can't imagine why you wouldn't want to support it, since
> it's just one file to install.

I don't say we don't want to support it, even if I'm not as enthusiastic
about it as you are.

Also, in my opinion, the main reason to have pkg-config is to make
things simpler for projects which are *not* using autoconf, but just want
to add CFLAGS=`pkg-config bla bla` directly in their Makefile or build
script.

> I took a quick peek at the documentation format but it doesn't look
> straightforward to update, 

Source format is texinfo, in the file doc/gmp.texi. If you want to work
with GNU packages, you should get at least somewhat familiar with it.

> It's optional. One popular convention is that you never break API
> without changing the name of the pkg-config file, and adding an API
> version there accomplishes that.

Are conventions and best practices documented somewhere?

> I like it since it's a nice way to ensure
> application developers know an API break has occurred, if API breaks
> are rare (as they seem to be for GMP).

That's tricky to make useful, you have to think carefully about what the
number means. There's ABI breaks, API breaks, and compatible extensions,
all tricky to capture in a single number. I think I'd prefer to not have
any version number here. And if a program needs to test for a particular
gmp feature, I *strongly* recommend doing traditional autoconf-style
functional tests for the features that are actually needed.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: GMP work on symbol visibility

2016-09-01 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> We should never forget that the most important aspect of our tests is to
> catch miscompilatilons.

If we build static and shared libraries from the same object files (I
suspect automake and libtool doesn't do that for us now), testing the
static library should detect miscompilations. But not mislinking,
including broken link-time optimizations.

And even if we don't do that, I would hope that the risk that a
miscompiled object file is caught by unit tests of internal functions,
but not by the tests of public gmp functions, is pretty low.

To make an informed decision, I think it would help to compile a rough
list of the functions to hide.

> http://www.airs.com/blog/archives/307

Is this 6 year old post still valid? The runtime linker could build some
index to avoid extra walks over symbol tables. I guess it would also be
useful with some function __attribute__ expressing that "I don't give a
damn if pointer comparison for address(es) to this function work
according to the C standard".

> I wonder if one could edit the .so file and make hidden symbols visible
> again.  Perhaps objcopy knows how to do that?

The point of hidden is to completely remove the symbols from the symbol
table in the shared object, right? So to get them back, I guess one
would need to reconstruct GOT and PLT entries either from the object
files or from debug info. Perhaps one can relink the .so file with some
linker flag saying "ignore hidden"? Or simply collect the object files
into a static library.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: [PATCH] Add pkg-config support

2016-08-30 Thread Niels Möller

Michael Catanzaro <mcatanz...@gnome.org> writes:

> It would be desirable for gmp to install a pkg-config file. This makes
> it much easier for other projects to link to gmp, especially projects
> that use Autotools build systems.

The GMP manual includes the recommended autoconf check. See
https://gmplib.org/manual/Autoconf.html#index-Autoconf. If we want to
support, or even encourage, pkg-config, this should be documented.

> diff -r 31d28753a4e1 -r 95850e2b3e2e Makefile.am
> --- a/Makefile.am Thu Jun 02 15:34:47 2016 +0200
> +++ b/Makefile.am Mon Aug 29 08:33:58 2016 -0500
> @@ -113,6 +113,8 @@
>  EXTRA_DIST = configfsf.guess configfsf.sub .gdbinit INSTALL.autoconf \
>COPYING.LESSERv3 COPYINGv2 COPYINGv3
>  
> +pkgconfigdir = $(libdir)/pkgconfig
> +pkgconfig_DATA = gmp-6.pc

I find the "-6" odd. By which convention should there be a version
number in there?

> --- /dev/null Thu Jan 01 00:00:00 1970 +
> +++ b/gmp-6.pc.in Mon Aug 29 08:33:58 2016 -0500
> @@ -0,0 +1,11 @@
> +prefix=@prefix@
> +exec_prefix=@exec_prefix@
> +libdir=@libdir@
> +includedir=@includedir@
> +
> +Name: gmp
> +Description: A free library for arbitrary precision arithmetic
> +URL: https://gmplib.org/
> +Version: @VERSION@
> +Libs: -L${libdir} -lgmp
> +Cflags: -I${includedir}

A curious question: If and how does the pkg-config machinery add linker
flags like -R or -Wl,-rpath, ? These are needed if using shared
libraries installed in some location not included in the default ld.so
search path. E.g., say I configure gmp with --prefix=/pkg/gmp/6.1.1/x86_64,
without adding the corresponding lib directory to /etc/ld.so.conf or
similar.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Flurry of GMP check failures

2016-11-06 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> But this is odd: At least Debian (didn't have a chance to check Gentoo
> yet) does not ban R_X86_64_64 which we use frequently from GMP's
> assembly for references to tables of constants.  It bans R_X86_64_32S
> which we use for tables of function pointer offsets.

Hmm, didn't openbsd pioneer this pie-by-default thing some year ago, and
there was a similar oddness with supported reloc types?

> To me, this discrepancy makes very little sense; either the resulting
> user binaries will not be PIE or they will require load-time reloc
> patching.

IIRC, the motivation was that a 64-bit reloc is large enough to *always*
be pathchable to point to the right place at load time, while a 32-bit
reloc might be too small, depending in the ultimate location of the
target. (If I haven't misunderstood what these relocs are; I have a very
fuzzy understanding of the details).

And if the code is expected to be pic and sharable at loadtime, neither
reloc type should be used in the text or rodata segment; while
R_X86_64_64 is perfectly fine for tables placed in the data segment.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Flurry of GMP check failures

2016-11-05 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> This fails:
>   $ gcc sym.c -c -fno-pic && gcc sym.o m.c

I guess we never need to pass -fno-pic to gcc, we should either pass
-fpic, when building shared library, or no pic-reaetd flags at all, when
building static libraries and executables. Right?

Problem is that when -fpic is on by default, assembly files need to use
pic conventions too. And to get this right automatically we'd need
configure to detect whether or not pic is on by default.

Or change the x86_64 assembly files to always use pic. Is there any
penalty to that? My understanding is that pic is somewhat of a pain with
32-bit x86, due to lack if addressing modes using pc + offset, but quite
nice and easy on x86_64.

> I'm migrating from Debian to Gentoo for GMP's testing, so unless
> Debian's approach is gaining momentum in the GNU/Linux world, we could
> just ignore this.

If debian is following ubuntu, I think we have to say there already is
some considerable momentum at work.

For reference, in Nettle, everything is built as pic by default since
the 1.11 release back in 2004. The motivation back then was the use case
that an interpreter, e.g., pike, loading it's crypto support code
dynamically using ldopen("Nettle.so"), and the corresponding .so file
then linked statically to libnettle.a. To get a non-pic static library,
one has to configure with --disable-shared --disable-pic, but I haven't
heard about anyone actually doing that.

At this point in time, I think use-cases involving static libgmp.a (or
libnettle.a) are somewhat obscure.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Flurry of GMP check failures

2016-11-05 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> 2. Something is broken wrt Debian sid from the past few weeks.
>It seems they want static lib code to be PIC.

Or rather, PIC code for all executables, to enable randomization of the
mapping of the text segment into the process address space. I think the
rationale is documented here: https://wiki.debian.org/Hardening.

Changes to default gcc seem to be described at 
https://wiki.debian.org/Hardening/PIEByDefaultTransition

It's not entirely clear if and how it's intended to affect builds which
don't involve debian-specific tools, maybe something is broken.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Flurry of GMP check failures

2016-11-05 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> Uh? I have 2 very natural use cases for libgmp.a:
>
> 1) to build a static (or at least with few dependencies) executable
> that I am going to send to someone else [...]

> 2) I am rebuilding GMP on my specific platform in order to get the
> best possible performance. [...]

I see. In these cases, is it important whether or not libgmp.a uses pic
code?

Regards,
/Niels



I might as well use a static libgmp to
> squeeze that extra bit of speed out of it (I might even want to use
> LTO in that case, although IIRC it currently causes some trouble in
> configure).
>
> Note that I do not care about security in either case.

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz_gcd_ext(NULL, ...)

2016-11-25 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I suppose it also implies a slight computation overhead
> to check size + low limb for the return value.

Slight enough to be reasonably for the mpz-layer, I think. 

One could consider doing that only when the gcd pointer is NULL, and
otherwise return some arbitrary value. But that's not a pretty interface
to document...

> The type of the gcd function change, which breaks e.g. user function
> pointers to these functions.

You're right, I forgot about that use case. 

> This sort of things probably below here:
> https://gmplib.org/devel/incompatibility.html

Agreed, and it's not obvious what's the right interface either.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz_gcd_ext(NULL, ...)

2016-11-25 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> This sort of things probably below here:
> https://gmplib.org/devel/incompatibility.html

Done. I also added an entry for _mpz_newalloc, while editing that file.

www-update failed, though:

  [nisse@shell /var/hg]$ ./www-update 
  Checking out www.
  destination directory: www
  updating to branch default
  201 files updated, 0 files merged, 0 files removed, 0 files unresolved
  Compressing .html and .txt. This can take a minute.
  ssh: connect to host www.gmplib.org port 22: Operation timed out
  rsync: connection unexpectedly closed (0 bytes received so far) [sender]
  rsync error: error in rsync protocol data stream (code 12) at io.c(226) 
[sender=3.1.2]

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-21 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Of the remaining https://gmplib.org/devel/mini-gmp-status.html issues I
> worry most about #2.  Marco adjusted the parameters to make it faster,
> but I remain unconvinced that g5.gmplib.org-dyn:32 really needed 400
> seconds for the test with old parameters.  I suspect the code can hang
> for certain seeds.  (We cannot tell the seed because of #1.)

If it happens again, the seed should be printed out.
https://gmplib.org/repo/gmp/rev/5abbd164e2a3

> I am not sure #8 (alpha/dupont) is for real.

I'll try to log in to alpha-gentoo and reproduce.

> #15 is strange, I haven't tried to understand why these libgcc link
> errors happens for just certain similar configs.

The direct links to examples are dead. Do these system normally have
LD_LIBRARY_PATH set? If so, maybe something wrong when gmp:s .lib is
prepended to the path.

> I am glad the "red" bugs had one cause (although a serious one).  It
> would have been much worse with many errors of this kind.  (Well, the
> results for end users are same same, but it matters for how to assess
> code quality.)

When the dust has settled, we'll have to think about whether we should
make a "mini-gmp release" with bug-fixes, or just send out an
announcement.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

New mini-gmp macro GMP_CMP?

2016-11-22 Thread Niels Möller

I was looking at how to simplify mini-gmp's mpz_cmp_si. I considered
using mpz_cmpabs_ui to handle the case of both inputs negative, which
made it more consice but perhaps not clearer. Then I came up with the
following. 

The pattern (un == 0) ? 0 : u->_mp_d[0] and variants are replaced by
calls to mpz_get_ui which does the same thing, and the pattern (a > b) -
(a < b) is replaced by a GMP_CMP macro.

Resulting context patch below, reducing the code size by a dozen of
lines. Do you think this is also an improvement in clarity?

/Niels

*** /tmp/extdiff.RPnb3V/gmp.afda1bbf3ee3/mini-gmp/mini-gmp.c2016-11-22 
09:18:29.455237339 +0100
--- /home/nisse/hack/gmp/mini-gmp/mini-gmp.c2016-11-22 08:54:54.290548487 
+0100
*** see https://www.gnu.org/licenses/.  */
*** 70,73 
--- 70,75 
  #define GMP_MAX(a, b) ((a) > (b) ? (a) : (b))
  
+ #define GMP_CMP(a,b) (((a) > (b)) - ((a) < (b)))
+ 
  #define gmp_assert_nocarry(x) do { \
  mp_limb_t __cy = (x);\
*** int
*** 1778,1784 
  mpz_sgn (const mpz_t u)
  {
!   mp_size_t usize = u->_mp_size;
! 
!   return (usize > 0) - (usize < 0);
  }
  
--- 1780,1784 
  mpz_sgn (const mpz_t u)
  {
!   return GMP_CMP (u->_mp_size, 0);
  }
  
*** mpz_cmp_si (const mpz_t u, long v)
*** 1795,1805 
  return 1;
else /* usize == -1 */
! {
!   mp_limb_t ul = u->_mp_d[0];
!   if ((mp_limb_t)GMP_NEG_CAST (unsigned long int, v) < ul)
!   return -1;
!   else
!   return (mp_limb_t)GMP_NEG_CAST (unsigned long int, v) > ul;
! }
  }
  
--- 1795,1799 
  return 1;
else /* usize == -1 */
! return GMP_CMP (GMP_NEG_CAST (mp_limb_t, v), u->_mp_d[0]);
  }
  
*** mpz_cmp_ui (const mpz_t u, unsigned long
*** 1814,1821 
  return -1;
else
! {
!   mp_limb_t ul = (usize > 0) ? u->_mp_d[0] : 0;
!   return (ul > v) - (ul < v);
! }
  }
  
--- 1808,1812 
  return -1;
else
! return GMP_CMP (mpz_get_ui (u), v);
  }
  
*** int
*** 1837,1849 
  mpz_cmpabs_ui (const mpz_t u, unsigned long v)
  {
!   mp_size_t un = GMP_ABS (u->_mp_size);
!   mp_limb_t ul;
! 
!   if (un > 1)
  return 1;
! 
!   ul = (un == 1) ? u->_mp_d[0] : 0;
! 
!   return (ul > v) - (ul < v);
  }
  
--- 1828,1835 
  mpz_cmpabs_ui (const mpz_t u, unsigned long v)
  {
!   if (GMP_ABS (u->_mp_size) > 1)
  return 1;
!   else
! return GMP_CMP (mpz_get_ui (u), v);
  }
  

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-22 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Venerable dupont (alphaev56 gentoo) seems to pass mini-gmp now.  Let's
> see if the previously consistently failing mini-gmp t-cmp_d now
> consistently passes; then I think we can assume this to be fixed.

Are there any of the changes which could if it now passes? The
LD_LIBRARY_PATH fixes?

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

mini-gmp testing (was: Re: Help stabilising mini-gmp)

2016-11-21 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> We should at least make sure the algorithms' corner cases are exercised,
> e.g., that large quotient are generated for Euclid's algorithm and that
> remainders of +-epsilon are used for division.

The mini-gmp testsuite is a bit hairy because it uses both gmp and
mini-gmp. This works fine with linking, thanks to the name mangling of
gmp symbols, but we can't include both gmp.h and mini-gmp.h in the same
compilation unit, and we have to types called mpz_t which in general
aren't compatible.

Most of the current tests works by invoking a function which uses the
real gmp to produce random inputs and a reference answer. E.g., (a, b,
a+b), or (a, b, gcd(a,b)). These are then translated between gmp mpz_t
and mini-gmp mpz_t via hex representation.

I think this is reasonable for simple things like add and mul, but maybe
not for the more complex things.

In particular, for division and gcd(ext), where the result is easy to
validate using multiplication and addition, there are different ways to
do things. We need to call into gmp to generate inputs or pieces of
inputs (because mini-gmp lacks mpz_urandom and mpz_rrandom). And we
obviously want to do the operation under test using mini-gmp. But we can
choose freely whether or not the code to produce inputs and validate
outputs should use gmp or mini-gmp.

One opposite way to organize the tests would be to have the main part of
the test programs use the real gmp, i.e., include gmp.h rather than
mini-gmp.h. And then do wrapper functions for the functions under test,
e.g, a mini_mpz_gcd which accepts gmp mpz_t as inputs, converts to hex,
passes on to a separate compilation unit which uses mini-gmp.h rather
than gmp.h to convert the hex inputs to mini-gmp mpz_t, call the
corresponding mini-gmp function, and convert back.

That should make it easier to copy testcases from the real gmp. The main
drawback I see is that testcases get less clear if there's a hairy
wrapper between the the testcode and the function it's testing. Complex
wrappers around generation of test inputs and validation of results
somehow suits my taste better, but I have no good rational motivation
for that.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz_gcd_ext(NULL, ...)

2016-11-25 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> a user was asking if we could support calling mpz_gcd_ext with a NULL
> first argument (the gcd), since they are only interested in the
> coefficients s and t and would like to save the unnecessary
> allocation. I doubt it would save that much, but it seems trivial to
> add a check if(g!=NULL) similar to the tests for s and t. Does it make
> sense to you?

Sounds reasonable to me. But then I'd also consider adding a return
value, returning one if the inputs were coprime (gcd == 1), otherwise
zero. Would be useful for mpz_invert. 

If we add a return value, for consistency we should do it for mpz_gcd
too, so one could define mpz_coprime_p as

#define mpz_coprime_p(u,v) mpz_gcd (NULL, (u), (v))

(Adding an integer return value is a breaking ABI change in theory, but
not really in practice, as far as I understand).

It won't save much for the computation, only the final copy. But it's
also a gain to clarity if callers don't have to pass in dummy, unused,
result arguments.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Using MPZ_ROINIT_N in mpz sources

2016-11-28 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> A single shared limb is my goal too, but it's a cost (comparable to a
> function call?) if we must use a shared symbol for it. IIRC Torbjörn said
> that the symbol hiding stuff might not be able to mitigate this cost,
> because of the PIC-only policy of some Linux distributions...
> On the other side, inlining a function with a static const var inside
> means, I believe, replicating lots of statics variables everywhere...

Sounds a bit hairy. If we need a function call, ideally, we should have
only one, and that should be calling mpz_init...

My understanding is that both data and rodata are loaded at a link time
constant offset from the code, so the address of a *static* constant would
be instruction pointer + link time constant offset. While the address of
a global constant is a single load from the GOT, which in turn is
located in the data sigment at a link-time constant offset from the
code.

So if we have

  x._mp_data = _limb;

and %rdp points to x, that should be something like

  lea (%rsp, constant_limb), %rax
  move %rax, _mp_data_offset(%rdp)

or

  mov (%rsp, constant_limb@GOT), %rax
  move %rax, _mp_data_offset(%rdp)

for a static and externally visible constant, respectively. Do I get
this right?

> One to be used in the declaration phase:
> mpz_t temp = MPZ_PREINIT(...);

I was thinking of something like

  mpz_t temp = MPZ_INITALIZER;

The nice thing about that is that it should work also for compound
initialization, like

  struct { int x; mpz_t y; } foo = { 1, MPZ_INITIALIZER; };

I think it might be useful to eventually make public too.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Seeding in mini-gmp

2016-11-24 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Should seeds really be limited to an unsigned long and at the same time
> to 4 bytes?  Both limits seem unnecessary.

It's platform dependent. On 64-bit machines, unsigned long is 64 bits
(except windows...), and we read 8 bytes from /dev/random. Which ought
to be enough.

On 32-bit machines, using gmp_randseed_ui limits the seed to 32-bits. If
that's a problem, we need to fix it; I just tried to keep things simple.

> I haven't looked deeper into the code, but if there is a seed function
> which accepts an mpz_t, then please consider using it instead.  And then
> follow GMP's example and read 6 bytes of random data from /dev/urandom.

There's gmp_randseed. I guess we could use that. Why 6 bytes (48
bits), and not, e.g., 64 bits?

Hmm, now I realize that it's quite important to be able to feed the same
seed to both 32-bit and 64-bit builds; then it's no good to use
gmp_randseed_ui with a platform-dependent range. I'll have to fix that.

>   +/* Unsigned long may be only 32 bits, and then a plain microsecond
>   +   count would wrap around in only 71 minutes. So instead, xor
>   +   microseconds with the most significant second bits, which are
>   +   the least "random". */
>   +return tv.tv_sec ^ (tv.tv_usec << 12);
>
> You probably need a cast there, else you'll typically end up with 32-bit
> arithmetic there.  (Or even better, use mpz_t here too.)

For this piece of code, I don't think we have much use for more than
32-bits. Iff time_t (type of tv_sec) is 64-bit, we'll use the bits
saying whether or not we're beyond 2038, which isn't particularly
random.

If we switch to using a 48-bit mpz_t instead, we could do this
differently and get a better range. E.g, low 48 bits of 100*tv_sec +
tv_usec would wraparound in almost 9 years, and tv_sec + tv_usec << 28
would also fill 48 bits and collide rarely.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-16 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I re-enabled it again, expect hundreds of failures due to the ABI
> adherence issue.  It's not easy to suppress those.

I think this small change should fix that, but I can't test it at the
moment.

--- a/Makefile.am   Wed Nov 16 23:07:02 2016 +0100
+++ b/Makefile.am   Thu Nov 17 07:12:08 2016 +0100
@@ -433,7 +433,7 @@ check-mini-gmp:
MINI_GMP_DIR="$$abs_srcdir/mini-gmp" \
LDFLAGS="-L../../.libs" \
LIBS="-lgmp -lm" \
-   CC="$(CC_FOR_BUILD)" EXTRA_CFLAGS="-g -I../.." check
+   CC="$(CC)" EXTRA_CFLAGS="$(CFLAGS) -g -I../.." check
 
 clean-mini-gmp:
    if [ -d mini-gmp/tests ] ; then \


Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-15 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I realise that mini-gmp's requirement of GNU make hurts GMP's build
> environment compatibility testing; until now, we have tried to use plain
> 'make' on almost all systems.  If mini-gmp now forces the entire build
> to use GNU make, then that's bad.

I agree. Is there any easy way to use GNU make only for make
check-mini-gmp? Either in test scripts, or automatically in that target
(the make used for the target itself doens't matter, it's the recursive
make, currently using $(MAKE), that really needs GNU make.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-15 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> https://gmplib.org/devel/tm/gmp/check/failure/bwldeb64v7.gmplib.org-dyn-fat:64.txt

That's t-powm. I can reproduce that one locally.

  GMP_CHECK_RANDOMIZE=1479273572 ./t-powm

> So we might have 3 pending arithmetic bugs in mini-gmp (unless the gcd
> one is a clang bug).

I left

  while GMP_CHECK_RANDOMIZE=0 ./t-gcd ; do : ; done

running over night, with no failures (64-bit x86_64, gcc-6). 

But I can reproduce the problem with the data from
shell.gmplib.org-stat-clang35-clang++35-fat:32.txt, 

   GMP_CHECK_RANDOMIZE=1479112190 ./t-gcd

So this one isn't clang only.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-15 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> For those issues which result in a check/failure entry, I have marked
> them as "expected" (by means of ~gmp/gmp/expected_check_failures).  I
> did that to make sure these failures do not hide any problems with GMP
> itself.

I tried to mark the mini-gmp t-limbs section with , but saving my
changes from emacs failed, 

  Cannot write backup file; backing up in /home/nisse/%backup%~
  basic-save-buffer-2: Doing chmod: operation not permitted, 
/home/gmp/gmp/expected_check_failures

> If you check in a fix for any of these problems, please strike the
> entire item by using   Please also clean up the
> ~gmp/gmp/check/failure report files.

How, more precisely? Just delete the corresponding files? E.g,

  shell.gmplib.org-dyn-gcc49-g++49:64.txt
  shell.gmplib.org-dyn-gcc49-g++49:64.txt.gz

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-19 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I made that and another change to now get the test to run on the
> majority of our systems.  The other change was this one:
>
>   https://gmplib.org/repo/gmp/rev/88fead579828
>
> This makes sure linking knows the ABI. 

To make sure I understand the issue, this is to get flags like -m32
passed on to the linker?

> I am not sure there are no side-effects conflicting with the author's
> intended Makefile behaviour.

I don't think so. But I think the right way is to simply add $(CFLAGS)
to the linker command line. That's the usual way to use it, right? And
omitting it in the linking rule in mini-gmp/tests/Makefile is a bug.

> Another problem with the Makefile.am check-mini-gmp code has surfaced.
> It overrides LD_LIBRARY_PATH (and DYLD_LIBRARY_PATH) for the entire
> mini-gmp build.  That includes gcc, which as you know is linked to GMP.
> The result is compilation problems on at least one platform.

Ah, that's an interesting side effect. To do something more specific to
the tests, how should that be named? Should we adopt the EMULATOR
variable (from automake, not sure if it's still in the current version)?
Then one would set

  EMULATOR="LD_LIBRARY_PATH=... DYLD_LIBRARY_PATH=..." \
  $(MAKE) ...

when running mini-gmp tests. I think EMULATOR is already supported by
the version of run-tests script which is in mini-gmp, it runs the test
programs ("$1" is the executable to be run) using 

  if [ -z "$EMULATOR" ] || head -1 "$1" | grep '^#!' > /dev/null; then
"$1" $testflags
  else
$EMULATOR "$1" $testflags
  fi

> There is also a problem with t-signed on ppc64 machines.

I noticed one of those failures, and it was't trivially reproducible on
x86_64. Needs some investigation.

The first failure was also a bit interesting, an internal compiler error
with gcc-6.

In file included from gmp/mini-gmp/tests/testutils.c:24:0:
gmp/mini-gmp/tests/../mini-gmp.c: In function 'mpz_set_d':
gmp/mini-gmp/tests/../mini-gmp.c:1647:3: internal compiler error:
Aborted
   if (x != x || x == x * 0.5)
   ^~

https://gmplib.org/devel/tm/gmp/build/failure/sky.gmplib.org-dyn-fat-fake:64.txt

Begs for a proper gcc bug report, I guess. I fail to log in to sky, via
nshell, at the moment ("ssh -p  localhost" on shell fails with
connection refused). And this failure was on the non-virtualized host
sky, right, not on one of the vms?

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-19 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> ni...@lysator.liu.se (Niels Möller) writes:
>   I don't think so. But I think the right way is to simply add $(CFLAGS)
>   to the linker command line. That's the usual way to use it, right? And
>   omitting it in the linking rule in mini-gmp/tests/Makefile is a bug.
>   
> Since CFLAGS includes -Iblah it seemed like a uglier choice.

Changed to pass CFLAGS when linking (according to GNU standards), and
move -I flags to CPPFLAGS where they belong.

> It seems "trivially" reproducible on ppc64 though, both real metal
> versions and fake ones like ppceb-debv8 (and ppcel-debv8).

I'll try to debug (if no one else beats me to it), but not today.

>   In file included from gmp/mini-gmp/tests/testutils.c:24:0:
>   gmp/mini-gmp/tests/../mini-gmp.c: In function 'mpz_set_d':
>   gmp/mini-gmp/tests/../mini-gmp.c:1647:3: internal compiler error:
>   Aborted
>  if (x != x || x == x * 0.5)
>  ^~
>
> That's the LD_LIBRARY_PATH bug, I think.

That could explain it (I was first thinking that "ICE" meant a failed
assert in the compiler code, but I guess any SIGSEGV or similar shows
the ICE message?). 

I just pushed an LD_LIBRARY_PATH improvement, we'll see if it reduces
the number of failures.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-20 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> After inlining, there are subtractions. check_si is called at least
> once with oi = si + c (c is ±1). gcc simplifies the test si > si - 1
> to true, and warns that this optimization may break your program if
> you rely on wrapping.

I suspect this test program relies on it. With c == 1, si takes the
values 2, 4, 8, ..., (1<<62), (-1<<63), and I guess the check gcc warns
about is intended to detect overflow and act as a stop condition.

And my adding of debug printf might have pushed the size of the function
over some threshold so that it's no longer inlined, comparison not
optimized away, and then the test succeeded.

> The usefulness of such a warning is debatable, and we tend to drop
> some of them from gcc when we think nobody will notice.

Not sure. There are two dangers: Programs relying on undefined behavior,
and optimization based on the assumption that there will never be any
undefined behaviour.

Anyway, I can now repro locally, by running 

  make check CFLAGS="-O -Wall -g -fsanitize=undefined -fno-sanitize-recover"

in the mini-gmp/tests source directory. This fails with

  t-signed.c:93:8: runtime error: signed integer overflow: -1 + 
-9223372036854775808 cannot be represented in type 'long int'
  FAIL: t-signed

It would make sense to test both gmp and mini-gmp with
-fsanitize=undefined.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-15 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> ni...@lysator.liu.se (Niels Möller) writes:
>
>   I tried to mark the mini-gmp t-limbs section with , but saving my
>   changes from emacs failed, 
>   
> My bad, please try now.

Worked better now.

>   How, more precisely? Just delete the corresponding files? E.g,
>   
> shell.gmplib.org-dyn-gcc49-g++49:64.txt
> shell.gmplib.org-dyn-gcc49-g++49:64.txt.gz
>   
> Yes, if permission allows it.  :-)

Done, 6 files deleted.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-20 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> Should we implement (or adapt) something like the tests/devel/try thing to
> widely test the basic operations in mini-gmp?

I think devel/try is mainly for testing assembly code, to find register
clobbering bugs and the like.

For mini-gmp, I hope running the testsuite with a large number of random
seeds should give good coverage, so we don't need any additional
framework.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Reorganized mini-gmp/tests/t-signed.

2016-11-20 Thread Niels Möller

ni...@lysator.liu.se (Niels Möller) writes:

> Anyway, I can now repro locally, by running 
>
>   make check CFLAGS="-O -Wall -g -fsanitize=undefined -fno-sanitize-recover"
>
> in the mini-gmp/tests source directory. This fails with
>
>   t-signed.c:93:8: runtime error: signed integer overflow: -1 +
> -9223372036854775808 cannot be represented in type 'long int'
>   FAIL: t-signed

I've checked in a fix for this.
https://gmplib.org/repo/gmp/rev/6a2a9d2f639c

Marco, would mind having a look at the reorganized test? I deleted
checks at the end of the loop which I didn't quite understand, and which
didn't quite fit with the new stop condition.

The idea is to instead of checking for overflow by examining signed
integer values, compute the mpz version first and use mpz_fits_slong_p
to check if the computation on signed long makes sense. And I added
tests to check that that function agrees with LONG_MAX and LONG_MIN.

I also avoid even computing the signed values when they wouldn't fit, to
make it possible to use -fsanitizer=undefined without complaints. 

It would be cool if -fsanitizer=undefined could cooperate with valgrind,
and not complain immediately when there's an undefined value, but
instead taint that value, and complain if it or anything which depends
in it is ever used for a branch condition, memory access or system call.
(And strictly speaking, that would make sense only for implementation
defined behaviour, not undefied behaviour, I guess).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-20 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> There are currently 10 outstanding mini-gmp issues according to
> <https://gmplib.org/devel/mini-gmp-status.html>, will be 9 if your
> LD_LIBRARY_PATH change is effective.  Is it up-to-date?

I think the three red arithmetic bugs (t-gcd, t-powm and t-div) were all
caused by the bug in mpn_invert_3by2, so hopefully fixed now. Or have
you seen any new failures for those?

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-21 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> 2. The latest Gentoo doesn't support -fno-sanitize-recover.  I suppose
>it works without it?

As I understand it, the point of -fno-sanitize-recover is to make
programs crash on the spot when there's some undefined operation. Not
sure what the behaviour is without that, if the program might complete
normally with just logging of problems, or if detected problems still
makes it exit with non-zero status.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Help stabilising mini-gmp

2016-11-14 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I created a web page for tracking the mini-gmp problems:
>
>   https://gmplib.org/devel/mini-gmp-status.html

From that page:

  (P2) The seeding triggered by GMP_CHECK_RANDOMIZE is apparently based on
  seconds since epoch (with some funny skew). This is not good enough, as
  tests which run at the same time will get the same seed. 

Seed is currently time(NULL) + getpid(), so a loop like

  while GMP_CHECK_RANDOMIZE=0 ./t-foo ; do : ; done

will typically use an incrementing sequence of seeds, and not repeat
until the pid space wraps around.

Main gmp tests seem to use /dev/urandom if available, falling back to
gettimeofday if available, otherwise only the time function (no pid).

For mini-gmp tests, it was my intention that tests should be able to run
standalone from the gmp tree (assuming gmp and GNU make is installed on
the system, just "make -C tests check" in a mini-gmp source tree should
work). So it shouldn't depend on any configure tests, and ideally should
use only standard C functions, or the most mainstream functions outside
of the standard. (And depending on getpid is not ideal... but a
significant improvement over only time(NULL)).

Attempting to open /dev/urandom seems very reasonable to me. If we do
that, the remaining question is how elaborate a fallback we need if
opening fails.

If really needed, we could also propagate certain gmp configure values
in the make check-mini-gmp target, e.g., HAVE_GETPID and
HAVE_GETTIMEOFDAY.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Using MPZ_ROINIT_N in mpz sources

2016-11-27 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> I do not remember if we require C99 to compile the library...
>
> Can we use MPZ_ROINIT_N inside our sources to improve readability? e.g.

For what it's worth, IIRC the non-constant initializer used by
MPZ_ROINIT_N has been supported by gcc more or less forever, at least
since 2.3 from the mid 1990s. And it would make some sense to me to
require c99 or gcc.

> Moreover, can we slightly abuse it to obtain a static lazy init w/out
> calling the _init function? E.g.

If we want to do that, I think we should have a separate macro for the
initializer (and preferably point to some shared read-only limb). Or
make mpz_init inline?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz reuse test takes too much time

2016-12-02 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Target operands which are not also input operands are now overwritten
> with garbage via the CLOBBER macro.  (It would make sense to do this in
> almost every other GMP test file.)
>
> I split your gcdext check macros into 2-operand and 3-operand variants.
> The old macros checked res3 even when it was not part of the
> computation, exhibiting possibly undefined behaviour.
>
> OK with you to commit?

Looks good to me, go aehad.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz reuse test takes too much time

2016-12-03 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

>> ! #define GCDEXT_CHECK2(i1, i2) do { \
>> !mpz_gcdext (res1, res2, NULL, i1, i2);  \
>> !MPZ_CHECK_FORMAT (res1);\
>> !MPZ_CHECK_FORMAT (res2);\
>> !if (mpz_cmp (ref1, res1) != 0 || mpz_cmp (ref2, res2) != 0) \
>> !  FAIL2 (mpz_gcdext, i1, i2, NULL); \
>> !  } while (0)
>
> Should we check both mpz_gcdext (res1, res2, NULL, i1, i2); and mpz_gcdext
> (res1, NULL, res2, i2, i1) in this macro?

Hmm, you changed the code to not only allow G to be NULL, but allow NULL
S too? 

I think that's unnecessary. As you say, mpz_gcdext(G, NULL, NULL, A, B)
is better done as mpz_gcd(G, A, B), and mpz_gcdext(*, NULL, T, A, B) is
equivalent to mpz_gcdext(*, T, NULL, B, A). (Maybe in the latter case,
return value can differ in some corner case, so I'm not sure it's 100%
equivalent, but in general swapping the inputs A and B implies swapping
of the S and T outputs).

So allowing NULL S adds more cases to implement and test, with no gain
in functionality.

> Since we currently support NULL also for the first argument, should we
> check also the cases
> mpz_gcdext (NULL, x, y, x, y);
> mpz_gcdext (NULL, y, x, x, y);
> mpz_gcdext (NULL, s, y, x, y); mpz_gcdext (NULL, y, t, y, x);
> mpz_gcdext (NULL, s, x, x, y); mpz_gcdext (NULL, x, t, y, x);

> mpz_gcdext (NULL, NULL, y, x, y); mpz_gcdext (NULL, x, NULL, x, y);
^
> mpz_gcdext (NULL, NULL, x, x, y); mpz_gcdext (NULL, y, NULL, x, y); ?
^
Maybe, except for the above two.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz reuse test takes too much time

2016-12-03 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> On Sat, 3 Dec 2016, Niels Möller wrote:
>
>> Hmm, you changed the code to not only allow G to be NULL, but allow NULL
>> S too?
>
> That was already the case before my patch. 

Sorry for the confusin, I had forgotten it was your change.

> Since s and t may get swapped depending on the values of a and b, you
> cannot really support t==NULL without supporting s==NULL. You may
> prefer not documenting it though...

I think that's an implementation detail that shouldn't be documented.
Can we just strike it from the docs? (And possibly back it up with an
ASSERT at function entry).

In the case

  mpn_gcdext(G, S, NULL, A, B)

with size(A) < size(B), IIRC we currently handle that by swapping
arguments, calling mpn_gcdext to compute T (even though the caller
didn't need that, and produce S with an multiplication using multiply +
exact division.

That doesn't seem ideal. The alternative would be to do an up-front
division, and call mpn_gcdext with inputs A, B mod A, and some different
postprocessing (I don't have the cofactor stuff in my head at the
moment, it should be somewhere between easy and trivial, and likely
involving the quotient floor (B/A).

One also needs to keep in mind that if |A| < |B|, then typically |T| <
|S| too. Which could influence algorithm choice if the size difference
|is large.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Changes for GMP 6.1.2

2016-11-29 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Whatever is safer.
>
> I think GMP isn't jealous, it can handle being less lazy than mini-gmp.

I think it will be easier for us to keep track if we copy head mini-gmp
(except the ChangeLog file which now needs special handling).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz reuse test takes too much time

2016-12-02 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Please check code spacing and . vs , in comments.

Done and checked in.

> Shouldn't res1, res2, res3 be overwritten with some fixed garbage when
> they are not also input operands?

This remains to do, as well as updating the corresponding mini-gmp test.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz reuse test takes too much time

2017-01-01 Thread Niels Möller

ni...@lysator.liu.se (Niels Möller) writes:

> In the case
>
>   mpn_gcdext(G, S, NULL, A, B)
>
> with size(A) < size(B), IIRC we currently handle that by swapping
> arguments, calling mpn_gcdext to compute T (even though the caller
> didn't need that, and produce S with an multiplication using multiply +
> exact division.

I've looked a bit more at this, and it's not at clear to me what
strategy one should use.

The cofactors satisfy

  G = S A + T B,

where S is the same order of magnitude as B/G, and T is of the same
order of magnitude as A/G. For simplicity, start with the case of small
G so we don't have to care about its effect on the sizes, and assume
that B is n limbs, A is m limbs, and n > m. Then we'd also expect S to
be n limbs and T to be m limbs.

Now we have two possible strategies: 

1. Call mpn_gcdext (B, A), which produces T. And compute S as

 S = (G - T B) / A

   that's one n x m multiply for T B, and one (n + m) / m exact division.

2. First do a division up front, 

 B = Q A + R

   and call mpn_gcdext(A, R). We then get a cofactor S', of m limbs. But
   we also need the other cofactor T', because the wanted cofactor S is

 S = S' - Q T'

   That's one (n - m) x m multiply. To get T', we have to use

 T' = (G - S' A) / R

   Which is a m x m multiply and an 2m / m exact division. I had a quick
   look at also substituting Q = (B - R)/A into the equations, but that
   didn't lead to any simplifications.

   (And we don't count the cost of the initial division, bacause it will
   be needed for strategy (1) too, done by mpn_gcdext).

So the question is, which cost is smallest,

   MUL(n, m) + DIVE(n + m, m), or 

   MUL(m, m) + MUL(n - m, m) + DIVE(2m, m) ?

With subquadratic multiplication, I'd expect 

   MUL(n, m) < MUL(m, m) + MUL(n-m, m)

or am I confused? But DIVE(n + m, m) > DIVE(2m, m), since the quotient
is larger.

And a third strategy could be to extend mpn_gcdext to support A < B, and
hence compute the larger cofactor directly. But I'd guess that would be
more work, since gcdext uses a quadratic algorithm for much larger sizes
than for multiplication and division.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Test compile: mpn_lshift_com optimization 2

2017-01-04 Thread Niels Möller

Richard Biener <rguent...@suse.de> writes:

> This test intoduced with GMP 6.1.2 FAILs with malloc debugging
> because it appearantly relies on malloc returing zeroed memory?
> Or it tests the wrong array elements?

Can you please clarify which test program fails? There's no "t.c" or
"a.out" files in my gmp tree. Please also provide the additional
information requested in the Bug reporting instructions in the manual.

For what it's worth, 

  TESTS_ENVIROMENT=valgrind make check

works fine in my gmp build.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpz_gcd_ext(NULL, ...)

2017-01-01 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> While we are looking at gcdext, I propose a small change to save an
> allocation when t must be computed. Can you review it, to check if the
> size assumptions are safe?

Sorry, I haven't looked too closely at this code...

I guess it still allocates an extra limb for some arguments, which no
longer is required by mpn_gcdext.

> PTR () = tmp_gp;
> SIZ () = gsize;
>
> PTR () = tmp_sp;
> SIZ () = tmp_ssize;

This style makes my uneasy... I'd prefer to either use real
reallocatable mpz_t for destination operands, or use mpn functions
directly... But that's unrelated to this change.

You did one more thing in the committed change 84ce52f0d360:

--- a/mpz/gcdext.c  Tue Dec 13 19:20:31 2016 +0100
+++ b/mpz/gcdext.c  Sat Dec 17 21:26:50 2016 +0100
@@ -64,7 +64,7 @@ mpz_gcdext (mpz_ptr g, mpz_ptr s, mpz_pt

   if (g != NULL)
{
- gp = MPZ_REALLOC (g, asize);
+ gp = MPZ_NEWALLOC (g, asize);
  MPN_COPY (gp, PTR (a), asize);
  SIZ (g) = asize;
}

That will break in case g == a, right? So we'd need to either keep
MPZ_REALLOC, or check for that special case. So I'm afraid the reuse
test doesn't hit this corner...

Regards, and Happy New Year,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Test compile: mpn_lshift_com optimization 2

2017-01-04 Thread Niels Möller

Emmanuel Thomé <emmanuel.th...@inria.fr> writes:

> On Wed, Jan 04, 2017 at 03:08:57PM +0100, Niels Möller wrote:
>> I'm also a bit puzzled. And after a quick look, I can't find any
>> lshift_com function in mul_fft.c (or anywhere else but acinclude.m4).
>
> It's mpn_lshiftc (and by the way the comment in ./mpn/generic/lshiftc.c
> is misleading -- lshiftc is not lshift).

Interesting. I thought that mpn_lshiftc was analogous to, e.g.,
mpn_add_nc and mpn_addmul_1c, with an additional argument providing the
bits to be shifted in. But it seems I were completely wrong.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_sqrtrem1

2016-12-20 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> I decided to try an as-branch-free-as-I-can Karatsuba implementation for
> umul_ppmm in the simpler environment: mini-gmp.
>
> Apply the following patch, define UMUL_PPMM_USE_KARATSUBA, and experience
> a slowed-down, larger-code version of mini-gmp ;-D

> +__x1 = (mp_limb_t) __ul * __vl;  \
[...]
> +if (__vh)\
> +  {  \
> + __x2 += __x1;   \
> + __ul += __x2 < __x1;\
> +  }  \
> +else \
> +  {  \
> + __ul -= __x1 > __x2;\
> + __x2 -= __x1;   \
> +  }  \

I think I'd arrange this as a branch free conditional negation +
sign-extend of __x1. If __vh represents sign as 0 or ~0, that would be
something like

  __x1 = (__x1 ^ __vh) - __vh;

Now {__vh, __x1} is a two-limb two's complement representation of the
product (__ul - __uh) * (__vl - __vh), where the notation in the latter
expression refers to the initial values of these variables. Which can be
accumulated into the final product.

But it still won't beat the plain version on any machine with a
reasonable mul-instruction.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: More mini-gmp asserts?

2017-07-23 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> ni...@lysator.liu.se (Niels Möller) writes:
>
>   Hi, I'm considering adding no-overlap asserts to mini-gmp's mpn_mul
>   function. Any reason not to? If we do it by copying the MPN_OVERLAP_P
>   macro, we should add a GMP_-prefix for namespace reasons.
>   
> This makes sense to me (since I haven't participated actively in
> mini-gmp's development I don't have a string opinion).

Done. Maybe such asserts should be addded in more places in mini-gmp?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_invert_limb

2017-07-23 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> These date back to 68020 and 80386.  I.e., quite obsolete.  :-)

Gone now.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mpn_sqrtrem{1,2} - patch for pure C implem

2017-06-09 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> The constant (c) I added to the first iteration should be incorporated in
> T1, and of course it is not necessary to really use a constant. A
> different value can be used for each entry in the table...

In Newton iteration, one usually converges from one side, and it's
tempting (to ease analysis and enable use of unsigned arithmetic) to
maintain a known and fixed sign of the error. But one could gain about
one bit of accuracy with error range roughly centered around zero.

For the initial iteration with tabulated values, you'd adjust each table
entry, as you say. In later iterations, one could add a suitably choosen
constant if one is willing to handle signed adjustments cancellation
logic.

When I worked on reciprocal quite a while ago, I aimed for a signed
symmetric error for the initial table lookup, but for the main
iterations, I designed for a non-negative error and I didn't explore
using signed errors.

> I believe that we can define
>
> h(a) = a >> GMP_NUMB_BITS - 9 /* the highest 9 bits */
>
> l(a) = (a >> GMP_NUMB_BITS - 20) & ((1<<12)-1) /* the next 11 bits */

One could save an instruction with using just

  l(a) = (a >> GMP_NUMB_BITS - 20)

and adjusting the tables accordingly. You'd get a larger multiply, but
limiting to 16x16 is probably not a saving on relevant platforms. 

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: GMP's x86-32 performance

2017-06-17 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Our latest batch of x86-32 code dates from 2011 (for the original Intel
> atom) but we have not done anything for high-end AMD and Intel CPUs
> (e.g., AMD k10, bulldozer, piledriver, steamroller, excavator, zen, or
> Intel penryn, nehalem, sandybridge, ivybridge, haswell, broadwell,
> skylake, kabylake) in a very long time.

When are those later cpus run in 32-bit mode? M$ windows or mac
applications? I would have expected 64_64 mode, possibly with some use
of the x32 abi (small pointers), to be used almost exclusively by now.

> What do I have in mind?  I believe pmovzxdq, pmuludq, psrlq (or some
> shuffle insn), and paddq could be used to build an addmul_2 which runs
> at at close to 1 cycle/limb using sse2,

I think I looked at pmuludq in the past, the variant doing two 32x32->64
multiplies, without having any success. IIRC, the throughput of that
instruction on then current cpus was too poor to make it useful. Other
possible reasons for failure: (i) I didn't try hard enough, (ii) too
much shuffling around of the operands are needed.

BTW, speaking of addmul_2. Where current addmul_2 wins over addmul_1,
that's because we get more independent mul instructions and can more
easily saturate multiplication units. At least, that's my understanding.

We've considered using karatsuba aka toom2 for addmul_2, but it has
always turned out that saving 1/4 of the multiply instructions is very
easily eaten up by the additional operations needed. But the other day,
it striked me that we might also try doing addmul_2 using toom32, which
would save 1/3 of the mul instructions. Toom32 is nice because we can
use the four easiest evaluation points: 0, infinity, and +/-1.

Or addmul_3 using toom32, which has the additional advantage that more
of the evaluation work is loop-invariant, and we could also jump to
separate innerloops depending on the carry bits from evaluation.

Perhaps this is still crazy, and useful only for machines with very slow
multiplication.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

addmul_k with toom (was: Re: GMP's x86-32 performance)

2017-06-17 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> ni...@lysator.liu.se (Niels Möller) writes:

>   we might also try doing addmul_2 using toom32, which
>   would save 1/3 of the mul instructions. Toom32 is nice because we can
>   use the four easiest evaluation points: 0, infinity, and +/-1.
>   
> Give it a shot!  A concern with any toom stuff is side-channel leakage.
> We currently assume _basecase is non-leaky.  So any low-level trickery
> needs to be applied above the _basecase cutoff points (unless we split
> out _basecase_sec functions).

If we do this with toom32, we'd need a separate basecase_sec. I think
there's little chance to gain any speedup if we enforce side-channel
silence. Looks better for toom2, there I think it costs only a single
instruction in the innerloop to handle all signs in a side-channel
silent fashion. If we don't trust cmov to be side-channel silent, we'd
also need to do conditional negation using a mask and xor + add instead
of cmov, unclear what that means for performance.

>   Or addmul_3 using toom32, which has the additional advantage that more
>   of the evaluation work is loop-invariant, and we could also jump to
>   separate innerloops depending on the carry bits from evaluation.

I've looked a bit more at this, writing some pseudo C code for the
simplest case where both evaluated values v0 + v1 + v2 and v0 - v1 + v2
are even, non-negative, and fit in a single limb after division by two.

One iteration of addmul_3 would then need roughly 5 alu instructions for
evaluation, 4 multiplies, 8 instructions to combine products into odd
and even terms (u0 v0 + u1 v1 + u0 v2) and (u0 v1 + u1 v0 + u1 v2), and
another 14 for the largish final additions, including the three carry
words from the previous iteration.

If we divide by 6 (since this corresponds to the work of 6 iterations of
addmul_1), that's 2/3 multiply instructions and 4.5 alu instructions.

A sketch for addmul_2 using toom2 (and evaluating at the -1 point) gives
3 instructions for evaluation, 3 multiplies, 13 instructions for
interpolation and addition. Divide by 4, to get 3/4 multiply and 4 alu
instructions to do the same amount of work as a single addmul_1
iteration. 

To compare to x86_64/aorsmul_1.asm, which uses one multiply and 3 adc
instructions per iteration, and at best runs at 2.5 cycles per limb, and
x86_64/coreibwl/addmul_1.asm, which uses an iteration consisting of
mulx, adox, adcx, and runs at about 1.66 cycles per limb.

> Slow hardware multiplication helps.  :-)

At the list https://gmplib.org/devel/asm.html, we have the "Intel noc"
(what's that?) with current addmul_1 running at 15 cycles per limb, and
Intel Atom where it runs at 19 cycles / limb.

Writing up an addmul_2 for intel atom using toom2 is probably the most
promising step in this direction.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Fast mpz_bin_ui

2017-10-06 Thread Niels Möller

Fredrik Johansson <fredrik.johans...@gmail.com> writes:

> The current mpz_bin_ui is not asymptotically fast. I propose replacing it
> with a call to mpz_bin_uiui when n fits one word (as already suggested in a
> code comment) and a plain divide and conquer product otherwise.

If I get this right, the speedup comes from a more balanced
multiplication tree for n (n-1)...(n-k+1), right?

I wonder if one can do something more clever. mpz_bin_ui_ui seems to
count the multiplicities of primefactors, somewhat related to the
factorial code, I think.

It would be nice to avoid having to compute k! and divide it out at the
end.

For a start, consider the powers of two, and for concreteness, say n =
1000, k = 15. Let !! denote semifactorial, k (k-2) (k-4) 

We can compute k! = 15!, taking out powers of two, as

  15! = 15!! 14!! = 2^7 15!! 7! =  2^7 15!! 7!! 6!!
  = 2^10 15!! 7!! 3!
  = 2^11 3^3 (5*7)^2 (15*13*11*9)

(Further sieving could take other primes into account too, the current
factorial code does that, I think)

To do the same splitting for de numerator n(n-1)...(n-k+1), we have

  1000*999*...*986 = (1000*998*...*986) (999*997*...987)
= 2^7 (500*499*...*493) (999*997*...987)
= 2^7 (500*498*496*494) (499*497*495*493) (999*997*...987)
= 2^11 (250*249*248*247) (499*497*495*493) (999*997*...987)
= 2^11 (250*248) (249*247) (499*497*495*493) (999*997*...987)
= 2^13 (125*124) (249*247) (499*497*495*493) (999*997*...987)

We could go on to take out factors of two of 124, but above is enough to
cancel out the power of two in 15!. Unlike the factorial case, we don't
get overlapping ranges and no nice squares or powers of products

 So we would get 1000 over 15 as

2^2 (125*124) * (249*247) * (499*497*495*493) * (999*997*...987)
  ---
   3^3 (5*7)^2 (15*13*11*9)

We could do sieving to take out the other factors of k!, but it's not
clear to me that would be an improvement. Assuming n is a bignum,
computing k! and dividing by it will be a fairly small operation, and
taking of the factors early won't save much work in computing the
numerator, as far as I see.

> Related to this, it would be useful to have a public function that computes
> rising factorials (say mpz_rising_ui). Then mpz_bin_ui could be implemented
> very simply by calling either mpz_bin_uiui or
> mpz_rising_ui+mpz_fac_ui+mpz_divexact.

There are a couple of building blocks in oddfac.c and prodlimbs.c, but
all for single-limb inputs, it seems.

If we want to do something reasonably simple, it would make sense to use
mpz_oddfac (which computes the odd part of the factorial), and mpz_fall_fac or
mpz_odd_fall_fac for computing the falling factorial or the odd part
thereof.

I'm not familiar with the sophisticated things done for mpz_bin_uiui and
mpz_fac, but maybe parts of that could be applicable also to the falling
factorials of a bignum?

It might work to first take out powers of two, and then rewrite
multiplies as squares like

  n (n-2)   = (n-1)^2 - 1
  n (n-2)(n-4)(n-6) = [(n-3)^2 - 5]^2 - 25

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Fast mpz_bin_ui

2017-10-12 Thread Niels Möller

Let me write some more, after pondering the problem for a few days.

Regular powers n^k (mod m) can be efficiently computed based on repeated
squaring and the basic rule that

  n^{2k} = (n^2)^k = (n^2)^k

(and without mod m, one would use the same algorithm, it's just that the
size of the result grows linearly with k).

First, let's note that there's unlikely to be any comparatively
efficient way to compute falling factorial powers, for it there were,
there would be an efficient way to compute n! (mod m), and that would be
a fast factoring algorithm: if n is the product of two unknown primes,
then the smallest factor is

  p = gcd (n, floor(sqrt n)! mod n )

The closest analog of a squaring rule for falling factorial powers I've
been able to find is

  n^_{2k} = n^{_k} (n-k)^{_k}

where n^{_k} denotes the falling factorial n (n-1) ... (n-k+1). And the
right hand side doesn't look much like a square, I'd wish we could
rewrite it to a formula like

  n^_{2k} = ((n-k/2)^{_k})^{_2} + some small number

but I'm afraid we can't.

So, back to binomial coefficients, 

 n \over k = n^{_k} / k!

and the case where n is larger than one limb, k a single limb, and k <=
2n. I guess k has to be reasonably small, for the result to fit in
available memory (but I don't have any numbers handy). Then k! is going
to be much smaller than n^{_k}, so that computing k! and doing the exact
division should be a very small fraction of the total work.

So for k!, I'd suggest just using mpz_oddfac_1 (and it would make sense
to me to change it to return the exponent for the omitted power of two,
which from the implementation of mpz_fac_ui seems to be k -
popcount(k)).

For n^_{k}, I would suggest a function to compute the odd part and
return the exponent of the power of two, which would repeatedly split in
even and odd numbers and compute products of the form

  n (n-2) (n-4) ... for odd n.

Each of these products could be computed by grouping four terms at a
time and use

>   n (n-2)(n-4)(n-6) = [(n-3)^2 - 5]^2 - 25

I like this trick, replacing three multiplies by two squares (but it's
unclear how much it will save compared to the total running time). For
the reasons given above, I wouldn't expect the trick to generalize much
further.

Then multiply all the resulting odd factors together in some roughly
balanced tree. And then we could do something like

void
mpz_bin_ui (mpz_ptr r, mpz_srcptr n, unsigned long int k) 
{
  mpz_t k_fac;
  mp_bitcnt_t k_exp, n_exp;

  mpz_init (k_fac);
  k_exp = mpz_oddfac_1 (k_fac, k);
  n_exp = mpz_odd_falling_fac (r, n, k);
  ASSERT (k_exp <= n_exp);
  mpz_divexact (r, r, k_fac);
  mpz_mul_2exp (r, r, n_exp - k_exp);

  mpz_clear (k);
}

Would you like to give it a try?

One could try some sieving and instead produce n^_{k} as a product of
prime powers for all primes <= k and a residual, to be able to easily
subtract the corresonding prime-power representation of k!.

It would be interesting with some analysis of the size of n \over k
after dividing out all prime factors smaller than k. If the prime powers
of primes <= k makes up a large part, sieving could pay off, since that
part can be computed efficiently.

Will n \over k be more or less "smooth" than a random number of similar
size? I'd guess slightly more smooth, since it's going to be huge
compared to n, but still can't have any prime factors larger than n. But
we can still have lots of prime factors in the range k < p <= n.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Revisiting gcd and gcdext for small operands

2017-12-01 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Nisse and other great common dividers!
>
> Please consider these measurements:
>
> ibr32$ ./tune/speed -p1000 -c -s1-60 -f1.2 -r mpn_mul mpn_gcd mpn_gcdext
>   mpn_mul   mpn_gcdmpn_gcdext
> 1  #39.146.5512   17.1607
> 2  #50.77   12.4422   51.4993
> 3  #66.27   36.0216   57.9561
> 4  #87.17   40.7439   58.6398
> 5 #110.95   41.7779   57.3457
> 6 #136.56   42.4274   54.8383
> 7 #165.67   42.7125   51.9271
> 8 #205.14   39.7861   47.5393
> 9 #249.47   37.0144   42.8386
> 10#289.77   36.0182   40.6772
> 12#398.58   32.2387   34.2015
> 14#516.54   28.4706   28.4355
> 16#653.70   25.3546   25.7443
> 19#887.40   21.4907   20.0516
> 22   #1196.18   17.9833   17.4368
> 26   #1617.71   15.1357   15.3959
> 31   #2196.36   12.3736   13.3466
> 37   #2963.459.1705   12.3230
> 44   #3970.829.4049   11.9397
> 52   #5309.947.8460   11.6024
>
> ibr64# tune/speed -p1000 -c -s1-60 -f1.2 -r mpn_mul mpn_gcd mpn_gcdext
>   mpn_mul   mpn_gcdmpn_gcdext
> 1  #34.519.5965   43.4851
> 2  #41.46   24.7985   91.1526
> 3  #63.37   60.3494   89.9912
> 4  #80.39   69.5938   92.7650
> 5 #105.76   69.8544   88.4005
> 6 #134.70   68.6191   83.3540
> 7 #173.62   64.7691   74.9246
> 8 #208.92   62.9363   72.0671
> 9 #262.49   56.4922   63.4748
> 10#314.04   52.3732   59.2200
> 12#437.50   46.1083   50.0230
> 14#578.39   41.1333   42.6137
> 16#714.52   38.2636   38.8416
> 19   #1021.12   32.3526   30.0309
> 22   #1274.16   29.6524   28.3373
> 26   #1653.53   26.4316   25.1063
> 31   #2283.67   22.3448   20.5088
> 37   #3185.51   16.2611   14.3919
> 44   #4003.10   16.8047   15.6203
> 52   #5287.91   12.4089   16.1354
>
> Without detail knowledge of the current implementation, these numbers
> suggest to me that we could speed small operand gcd and gcdext.  But I
> might be wrong.

Hmm. I'd reason like this: All three functions are expected to be
quadratic for small sizes. And then mul should become faster relative to
the others above the Karatsuba threshold, since the gcd thresholds are
much higher then the Karatsube threshold.

But above numbers don't look like that at all. Which might indicate that
the linear term of the complexity is *much* higher for gcd than for mul.

Do I read the data the same way as you?

> Would asm "binary algorithm" code also make sense for, say, n = 10?  I
> have some doubt, as C code around the current (asm) primitives add_n,
> sub_n, rshift probably is close to optimal.

I'd guess that assembly code for small sizes, which keeps the operands
in registers, could bring some speedup. Could start with just n=2, to
see what the gain is.

But before doing that beyond n = 2, I think we should consider an
assembly implementation of hgcd2. That's the main linear part of
gcd-complexity.

> I had an idea now for using a Hensel norm Euclidean algorithm.  Finding
> a^(-1) mod 2^b where b is the limb size in the inner loop would be
> expensive, but how about letting b = 8 or thereabout?  Then a table
> lookup would work for the inverses.  Does that make sense?

Say we compute a k-bit hensel quotient, and replace (a, b) by ( (a + q
b)/2^k, b). Then we get no additional progress by making k larger than
the bitsize difference between a and b; making q b much larer than a is
useless.

So just like the plain Euclidean gcd, useful quotients are pretty small
on average. To get more progress than one or two bits per step, one
needs either a matrix (representing multiple division steps), or a more
general linear combination a <-- k a + m b (like the "accelerated gcd"
by Ken Weber, and with some need to handle spurious factors).

One can gain one bit by using signed values, though, and choose q so
that |q| < 2^{k-1}). Putting k = 3 (as you suggest) and |q| < 4 might
give a speedup over plain binary. If it can be implemented to keep
branches to a minimum.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp

2017-12-06 Thread Niels Möller

paul zimmermann <paul.zimmerm...@inria.fr> writes:

> However, we are still missing a proper (documented) way to distinguish
> mini-gmp from plain gmp. We could use the __MINI_GMP_H__ macro but it is not
> documented.

We could add some documented macro for this purpse, but I'm not sure I
understand why you need it. In nettle, I do things like this:

In configure.ac: 

  AC_ARG_ENABLE(mini-gmp,
AC_HELP_STRING([--enable-mini-gmp], [Enable mini-gmp, used instead of
libgmp.]),,
[enable_mini_gmp=no])

  if test "x$enable_mini_gmp" = xyes ; then
NETTLE_USE_MINI_GMP=1
  else
NETTLE_USE_MINI_GMP=0
  fi

  AC_SUBST([NETTLE_USE_MINI_GMP])

Then in version.h.in (which is subject to configure substitutions, and
unlike config.h, nettle/version.h is a public header file):

  #define NETTLE_USE_MINI_GMP @NETTLE_USE_MINI_GMP@

Finally, in bignum.h, 

  #include "version.h"

  #if NETTLE_USE_MINI_GMP
  # include "mini-gmp.h"
  /* Any other min--gmp-specific hacks */
  #else
  # include 
  #endif

And then I have a couple of #if NETTLE_USE_MINI_GMP spread out in the
rest of the code.

> And I notice that very few of the division functions from mini-gmp correspond
> to those of the GMP documentation. Thus to compile MPFR with mini-gmp we had
> to re-implement mpn_divrem_1, mpn_divrem and mpn_tdiv_qr. We can contribute
> them to mini-gmp if you want.

mini-gmp.h doesn't declare any mpn_*div functions. The mpn division
interface in GMP needs some work, and the code in mini-gmp is closer in
spirit to the new division functions we worked on (must be a decade
ago!), which still haven't made it to the public GMP api.

For compatibility, it would make some sense to add more of the public
mpn division functions to mini-gmp, if they can be done as reasonably
simple wrappers around the existing functions.

> The other functions we had to re-implement are some random functions
> (using srand48/lrand48, thus generating different random sequences, but
> for MPFR this is fine),

Right, I think sophisticated random numbers is out-of-scope for
mini-gmp.

> and finally mpz_dump.

Hmm. It seems this function is declared in gmp.h, but not mentioned in
the manual, and with a discouraging comment at the top of mpz/dump.c.

Any reason you can't switch to always use mpz_out_str instead?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp

2017-12-06 Thread Niels Möller

paul zimmermann <paul.zimmerm...@inria.fr> writes:

>> understand why you need it.
>
> for example, we check the version of GMP is at least 5.0.0:
>
> #if !__MPFR_GMP(5,0,0)
> # error "GMP 5.0.0 or newer is required"
> #endif
>
> where the __MPFR_GMP macro uses __GNU_MP_VERSION, ...

So you can detect that mini-gmp is used with

#ifndef __GNU_MP_VERSION

But that's not the problem?

> Since mini-gmp does not provide any version macro, we can't check
> the version of mini-gmp is good enough.

Is the issue really that you want to distinguish between different
versions of mini-gmp? My recommendation is to including a copy of
mini-gmp with mpfr, and then you know which version you have at any
point, and you can upgrade it in a controlled manner to get new features
or bugfixes.

If we add a version number to mini-gmp, what should it be? The only
"releases" of mini-gmp are gmp releases, should we adopt that, and then
have no well-defined version number for mini-gmp between gmp releases?

I'm afraid I'm still a bit confused, and I don't quite understand what
problem you are aiming to solve with additional defines in mini-gmp.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp

2017-12-06 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> As far as I understand, the goal of mini-gmp is that a user can take a
> copy of those 2 files, stick them in his project, and get a
> self-contained program.

That's right. And preferably with a configure time option to use the
real GMP if available.

> Unless you provide a similar mini-mpfr, your
> user is going to have to install the mpfr dependency anyway, his
> project is not self-contained, so he might as well install the real
> GMP.

Even dropping only the GMP dependency is one thing less that can go
wrong. 

One user of interest is gcc. As far as I understand, it uses mpfr and
mpc (for complex arithmetic) for constant propagation. Which means
pretty small numbers, and the time spent on this arithmetic is likely a
very small proportion of the compile time. Using the real GMP when
easily available is nice, but it's also massive overkill.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Generic get_d_2exp failures

2017-10-28 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> 2) The rounding occurs in the addition in
>
>   weight = 1/MP_BASE_AS_DOUBLE;
>   d = up[size - 1];
>   for (i = size - 2; i >= 0; i--)
> {
>   d += up[i] * weight;
>   weight /= MP_BASE_AS_DOUBLE;
>   if (weight == 0)
> break;
> }
>
> we could make that code uglier to make sure there is no rounding up,
> maybe comparing d+up[i]*weight to d+oldweight, and trying again by
> zeroing out an increasing number of low bits of up[i].

It would be nice if we could find a portable way to add two floating
point values without rounding up. Would something like this work?

  s = a + b; /* Assume a > b */
  r = (s - a) - b;  /* No rounding expected here. */
  if (r > 0)
s -= 2*r;

But doesn't make the tests pass  in my initial testing. I guess it might
be necessary to isolate the high bit of r (possibly be converting to
mp_limb_t ans usign count_leading_zeros), which should correspond to
the lsb of the mantissa (or if it's half the lsb).

And might also need special handling of the case that s, after rounding,
is a power of two.

The variant I tested:

@@ -368,7 +369,13 @@ mpn_get_d (mp_srcptr up, mp_size_t size,
   d = up[size - 1];
   for (i = size - 2; i >= 0; i--)
{
- d += up[i] * weight;
+ double l, s, r;
+ l = up[i] * weight;
+ s = d + l;
+ r = (s - d) - l;
+ if (r > 0)
+   s -= 2*r;
+ d = s;
  weight /= MP_BASE_AS_DOUBLE;
  if (weight == 0)
break;

> 3) Drop the generic path. It hasn't passed the testsuite for a long
> time, can't be that important.

If we keep it, we'd need to (i) fix it, and (ii) test it regularly.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Generic get_d_2exp failures

2017-10-28 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> On Sat, 28 Oct 2017, Niels Möller wrote:
>
>> It would be nice if we could find a portable way to add two floating
>> point values without rounding up. Would something like this work?
>>
>>  s = a + b; /* Assume a > b */
>>  r = (s - a) - b;  /* No rounding expected here. */
>>  if (r > 0)
>>s -= 2*r;
>
> Not sure where the 2*r is coming from. And I am not very optimistic
> that there is such a simple formula that works for any rounding mode,
> though I could easily be wrong.

Sorry for the confusion, I mistakenly thought that since the rounding
error affects the lsb, it would be of the same order. But I now realize
that r may be much smaller, and affect the lsb only after carry
propagation through low (otherwise ignored) mantissa of b.

So on second thought, I don't think we can fix the rounding by only
examining a, b, and s. We could compute rounded_b = b + r, and then we
know that s == a + b_rounded, with no rounding, but that still doesn't
tell us the position of the lsb, only that it's somewhere between the
least signficant 1-bit of b_rounded and the most significant 1-bit of r.

> There is also the question of how portable exactly it needs to be. C99
> nextafter can be helpful.

So 

  if (r > 0)
s = nextafter(s, -infinity)

might work. Except that (i) c99 might not be available on obscure
systems where this code is used, and (ii) we try to avoid depending on
linking with -lm.

> It may be easier to assert that FLT_RADIX==2 and use DBL_MANT_DIG to
> avoid any rounding.

That's probably the sane way to do it.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: udiv_qr_3by2 vs divappr

2018-05-09 Thread Niels Möller

paul zimmermann <paul.zimmerm...@inria.fr> writes:

> if you use udiv_qr_3by2 to divide (say) a 6-limb number by a 3-limb number,
> then in the schoolbook loop you might have {n2, n1} = {d1, d0}, in the case
> where n0 is smaller than the next divisor limb d[-1].
>
> If it is the responsibility of the caller to ensure {n2, n1} < {d1, d0},
> then every caller of udiv_qr_3by2 must deal with that special case.

I think the reason udiv_qr_3by2 and udiv_qrnnd_preinv requires n < d is
that returning GMP_NUMB_MAX for n == d requires one additional check,
and some callers (in particular mpn_div_qr_1 and mpn_div_qr_2) don't
need that.

Tradeoffs may be different for divappr_2.

> In my opinion, it would be simpler to allow {n2, n1} = {d1, d0}, and
> return q=GMP_NUMB_MAX in that case.

I've tried to analyse the problem of divappr_2 sometimes overflowing q
and trying to return GMP_NUMB_MAX + 1. To avoid that, it turns out we
need a check

  if ({n_2, n_1} >= {d_1, d_0} - 2)
q = GMP_NUMB_MAX;

(and that quotient is still approximate, so we can't omit the schoolbook
division adjustment step). May still want to leave that check outside of
divappr_2, to take the computation of {d_1, d_0} - 2 out of the loop. My
current version of the schoolbook division code does something like

  d1m2 = d_1 - (d0 < 2);

  for (...) 
{
   if (UNLIKELY (n2 >= d1m2) && (n2 > d1m2 || (n1 >= d0 - 2)))
 q = GMP_NUMB_MAX;
   else
 q = divappr_2(...);
   submul_1 + adjustment
}

Left to do is strict analysis of the case d0 == 0, the code seems to
work fine without any special case for that, even the computation of r1
(based on the assumption that d0 > 0) is off by one in this case.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: dead code in div_q.c?

2018-04-27 Thread Niels Möller

paul zimmermann <paul.zimmerm...@inria.fr> writes:

>> It would be good to either prove that qh > 0 can't happen, or add a test
>> that will exercise the code handling that case.
>
> yes more generally it would be nice to know for the divappr functions, if the
> exact quotient fits on qn limbs (i.e., the exact qh is 0), can the
> "approximate" qh be 1?

I suspect we may have qh == 0 always in this special context, where we
have additional constraints on N and D.

But in general, I doubt we can promise that the qh return value from
divappr euals what we would get for the correct quotient. I.e., that the
qh return values from div and divappr always match. Seems particularly
unlikely for the mu divappr, which can have an error larger than 1 unit.

Not sure how to come up with an example, though.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: documentation on internals not up to date

2018-04-27 Thread Niels Möller

Marc Glisse <marc.gli...@inria.fr> writes:

> There would be a significant advantage to mpq if we could have a
> non-allocated 1 for the denominator. But indeed, with the current code
> where only some mpz functions would work, it seems safer to document
> that none work.

We could do that internally, even if we don't advertise it for other gmp
users. Then mpq_init wouldn't do any allocation, right? We could have a
single mpq object representing 0/1 with _mp_alloc == 0 for both parts,
and initialize with struct assignment or memcpy.

We would just ensure that mpz realloc keeps supporting thuis case.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

divappr interface

2018-04-27 Thread Niels Möller

Hi, 

after the discussion on div_q, I had another look at divappr, or more
precisely, the mpn_sbpi1_divappr_q function.

It's not entirely obvious from the implementation which limbs of the
{np, nn} argument are actually used. But I think it should access only
the qn+2 upper limbs. For accuracy, it should be sufficient with qn+1
limbs, but I suspect qn+2 is more convenient since we lack a variant of
mpn_submul_1 which ignores the low half of the lowest limb product (we'd
need a function for R -= floor (D * q / B)).

For a call to divappr to produce qn quotient limbs (plus possibly a qh
being zero or one), inputs should be a numerator of qn+2 limbs and a
normalized divisor of at most qn+1 limbs.

As long as qn > dn - 1, we loop computing one exact quotient limb and
eliminating one high limb of the numerator per iteration, decrementing qn.
(This phase would be equivalent to a call to sbpi1_div_qr, I guess).

Once we reach qn = dn - 1, keep looping to produce quotient limbs, but
also discard one limb of dp in each interation, until we in the final
iteration have qn = 1, qn+2 = 3 numerator limbs, and qn+1 = 2 divisor
limbs, i.e., a single udiv_qr_3by2 (where we might consider skipping the
adjustment steps). I haven't done the error analysis, but I would expect
that errors are similar to the current code.

This is analogous to the discussion of mpn_bdiv_q, which we discussed
years ago, where N and Q are of the same size, and D possibly smaller.
The structure with two loops is also similar to the old mpn_divrem
function with fractional limbs; it would be cleaner if we could have a
function doing only the fractional part, i.e., require that dn == qn +
1.

Fixing this should provide a small speedup for mpn_div_q, since 

  new_nn = 2 * qn + 1;
  ...
  cy = mpn_lshift (new_np, np + nn - new_nn, new_nn, cnt);
  ...
  qh = mpn_sbpi1_divappr_q (tp, new_np, new_nn, new_dp, qn + 1, dinv.inv32);

implies that new_nn and the shift are twice as large as they need to be
(new_nn should be only qn+2).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: dead code in div_q.c?

2018-04-27 Thread Niels Möller

ni...@lysator.liu.se (Niels Möller) writes:

> --- a/tests/mpn/t-div.c   Wed Apr 25 07:38:14 2018 +0200
> +++ b/tests/mpn/t-div.c   Wed Apr 25 22:19:17 2018 +0200
> @@ -445,6 +445,7 @@ main (int argc, char **argv)
> alloc = itch + 1;
>   }
> scratch[itch] = ran;
> +   MPN_COPY (qp, junkp, nn - dn + 1);
> mpn_div_q (qp, np, nn, dup, dn, scratch);
> ASSERT_ALWAYS (ran == scratch[itch]);
> ASSERT_ALWAYS (qp[-1] == qran0);  ASSERT_ALWAYS (qp[nn - dn + 1] == 
> qran1);

Committed.

> *** /tmp/extdiff.0aBNfO/gmp.765c2c27523b/mpn/generic/div_q.c  2018-04-25 
> 21:55:51.457769871 +0200
> --- /home/nisse/hack/gmp/mpn/generic/div_q.c  2018-04-25 21:18:01.516501993 
> +0200
> *** mpn_div_q (mp_ptr qp,
> *** 171,184 
> if (cy == 0)
>   qp[qn - 1] = qh;
> !   else if (UNLIKELY (qh != 0))
> ! {
> !   /* This happens only when the quotient is close to B^n and
> !  mpn_*_divappr_q returned B^n.  */
> !   mp_size_t i, n;
> !   n = new_nn - dn;
> !   for (i = 0; i < n; i++)
> ! qp[i] = GMP_NUMB_MAX;
> !   qh = 0;   /* currently ignored */
> ! }
>   }
> else  /* divisor is already normalised */
> --- 171,176 
> if (cy == 0)
>   qp[qn - 1] = qh;
> !   else
> ! ASSERT_ALWAYS (qh == 0);
>   }
> else  /* divisor is already normalised */

And committed this first part (but with ASSERT instead of ASSERT_ALWAYS);.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: documentation on internals not up to date

2018-04-27 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> ... and it's not just a problem of supporting mpq_denref() we should also
> change some mpq_ functions. If someone does init an mpq_t with:
>   mpz_init (mpq_numref (accumulator));
>   mpz_roinit_n (mpq_denref (accumulator), _one, 1);
>
> then a simple, mpq_add (accumulator, accumulator, another_mpq) might fail.

I don't think we should document support for that code. If the user
initializes parts of an mpq with a read-only mpz objects, the user ought
to treat also the mpq object as readonly.

That said, it sounds reasonable to me to use an an unallocated one for
the internal initialization of the denominator. Then we have to support
that the user accesses it and assigned to it like any other mpz object.
But that doesn't mean that we have to promise that mpz objects the user
creates using ro_init behaves in the same friendly way.

E.g., in case we for some reason change the internals to break
assignment to non-zero unallocated objects, we could hack mpq_denref to
do a real allocation if needed (except that the macro is documented to
work on a const mpq_t). Or go back to using an allocated one.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: divappr interface

2018-04-27 Thread Niels Möller

ni...@lysator.liu.se (Niels Möller) writes:

> Once we reach qn = dn - 1, keep looping to produce quotient limbs, but
> also discard one limb of dp in each interation, until we in the final
> iteration have qn = 1, qn+2 = 3 numerator limbs, and qn+1 = 2 divisor
> limbs, i.e., a single udiv_qr_3by2 (where we might consider skipping the
> adjustment steps). I haven't done the error analysis, but I would expect
> that errors are similar to the current code.

In the current mpn_sbpi1_divappr_q, there's a curious flag (or mask)
variable used in this loop. It's initially all ones, cleared if we ever
fail to cancel the top limb of the current partial remainder. When the
flag is cleared, remaining quotient limbs are set to GMP_NUMB_MAX.

Torbjörn, Paul, do you remember the analysis behind this? (Code is since
2009...).

I would guess it might happen that even if higher quotient limbs are all
correct, we might get non-zero high limb in

  partial remainder - GMP_NUMB_MAX * truncated D 

If we set the rest of the quotient limbs to GMP_NUMB_MAX, then the
quotient should be large enough thanks to the low end divisor limbs
which we ignored in the truncation.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

udiv_qr_3by2 vs divappr

2018-04-28 Thread Niels Möller

It's fun looking at clever things you did a decade ago, and try to
improve...

I just realized that our 3/2 division, udiv_qr_3by2, work horse for the
linear work of schoolbok division, has one more input than it needs.

We would get a useful quotient if we always passed GMP_NUMB_MAX for the
n0 argument. Which wouldn't provide any useful gain by itself (and
require adjustments where the remainder is used).

But it shows that we could have a function with inputs n2, n1, d1, d0,
computing a suitable quotient approximation. Fixing n0 as above would
mean that we always return

  q = f(n2, n1, d1, d0) = floor ({n2, n1, B-1} / {d1, d0})

but we could allow slightly more freedom in what we return.

Inputs are a two limb normlized divisor, {d1, d0}, d1 >= B/2, and two
numerator words, {n2, n1}, which must be less than {d1, d0}. I think
we'd want the same precomputed inverse as for 3/2.

It computes an single word approximate quotient q, roughly B {n2,n1} / {d1,
d0}, which, when applied to larger numbers as part of schoolbook
division, is never too small, at most one too large, and highly likely to
be correct.

If convenient, the function could also return a single word remainder 

  r = {n1,n0} - q d1 - mulhi(q, d0), 

I think such an r would have to be in the range 0 <= r <= d1, with some
subteties on when r == d1 is permissible. We might also want to return
mullo(q, d0), if that's easily available.

Potential I see:

* It might be possible to implement this cheaper than udiv_qr_3by2,
  omitting most of the book-keeping and carry propagation involving the
  low remainder word r0.

* One limb less to extract and handle specially in the schoolbook
  division loops. It would be particularly nice if we try to write a
  schoolbook division that doesn't require arguments to be normalized up
  front, but instead just normalizes top limbs on the fly.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: documentation on internals not up to date

2018-05-05 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> What about the following?
>
> diff -r 85eda8a9f6e2 doc/gmp.texi
> --- a/doc/gmp.texiFri May 04 21:43:34 2018 +0200
> +++ b/doc/gmp.texiFri May 04 23:12:34 2018 +0200
> @@ -10155,7 +10155,7 @@
>  @item @code{_mp_size}
>  The number of limbs, or the negative of that when representing a negative
>  integer.  Zero is represented by @code{_mp_size} set to zero, in which case
> -the @code{_mp_d} data is unused.
> +the @code{_mp_d} data is undefined.
>
>  @item @code{_mp_d}
>  A pointer to an array of limbs which is the magnitude.  These are stored
> @@ -10164,17 +10164,21 @@
>  significant.  Whenever @code{_mp_size} is non-zero, the most significant
> limb
>  is non-zero.
>
> -Currently there's always at least one limb allocated, so for instance
> -@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch
> -@code{_mp_d[0]} unconditionally (though its value is then only wanted if
> -@code{_mp_size} is non-zero).
> +Currently there's always at least one readable limb, so for instance
> +@code{mpz_get_ui} can fetch @code{_mp_d[0]} unconditionally (though its
> +value is undefined if @code{_mp_size} is zero).
>
>  @item @code{_mp_alloc}
>  @code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d},
> -and naturally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
> +and normally @code{_mp_alloc >= ABS(_mp_size)}.  When an @code{mpz} routine
>  is about to (or might be about to) increase @code{_mp_size}, it checks
>  @code{_mp_alloc} to see whether there's enough space, and reallocates if
> not.
>  @code{MPZ_REALLOC} is generally used for this.
> +
> +@code{mpz_t} variables initialised with the @code{mpz_roinit_n} function
> +or the @code{MPZ_ROINIT_N} macro have @code{_mp_alloc = 0} but can have a
> +non-zero @code{_mp_size}.  They can only be used as read-only constants.
> +See @ref{Integer Special Functions} for details.
>  @end table
>
>  The various bitwise logical functions like @code{mpz_and} behave as if

Looks good to me.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: _ptr and _srcptr types

2018-05-19 Thread Niels Möller

Vincent Lefevre <vinc...@vinc17.net> writes:

> I've just noticed that GMP provides mpz_inits, so that mpz_ptr is
> necessary: Since type checking cannot be done with variadic functions,
> one must provide types compatible with mpz_ptr, or the behavior is
> undefined. This is not an issue for mpz_t variables, but the last
> argument needs to be (mpz_ptr) 0 or equivalent.

Isn't (void*) 0 good enough to get defined behavior? (For which NULL is
an alias in most C (but not C++) compilers).

Digression:

As far as I've understood, the C standard requires NULL to expand to
either 0 or ((void*)0), and all sane compilers do the latter. While the
C++ standard requires NULL to expand to 0, which means that C++ code
calling mpz_inits can't use NULL to terminate the argument list, it
should probably use nullptr instead. 

(And I think at least g++ tries to work around this strange C++ quirk by
expanding NULL to some magic *slightly* different from 0, which will
produce some warnings if it used as an integer rather then pointer).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp mpz_{get,set}_d not fully compatible with GMP

2018-05-19 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> diff -r 41e50c4fdc46 bootstrap.c
> --- a/bootstrap.c   Wed May 16 08:36:03 2018 +0200
> +++ b/bootstrap.c   Fri May 18 07:24:52 2018 +0200
> @@ -29,6 +29,7 @@
>  see https://www.gnu.org/licenses/.  */
>
>
> +#define DONT_USE_FLOAT_H 1
>  #include "mini-gmp/mini-gmp.c"
>
>  #define MIN(l,o) ((l) < (o) ? (l) : (o))
> diff -r 41e50c4fdc46 mini-gmp/mini-gmp.c
> --- a/mini-gmp/mini-gmp.c   Wed May 16 08:36:03 2018 +0200
> +++ b/mini-gmp/mini-gmp.c   Fri May 18 07:24:52 2018 +0200
> @@ -50,6 +50,9 @@
>
>  #include "mini-gmp.h"
>
> +#if !defined(DONT_USE_FLOAT_H)
> +#include 
> +#endif

Makes sense to me, provided that the define also excludes whatever
mini-gmp functions depend on float.h (which would be get_d and set_d,
possibly cmp_d), and is named accordingly.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp mpz_{get,set}_d not fully compatible with GMP

2018-05-20 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> float.h is only used for
>
> +#if defined(DBL_MANT_DIG) && FLT_RADIX == 2
> +#define GMP_DBL_MANT_BITS DBL_MANT_DIG
> +#else
> +#define GMP_DBL_MANT_BITS (53)
> +#endif
>
> This constant is used only by the new version of get_d, to convert the
> correct number of bits and avoid rounding.
> I mean, the functions do not "depend on" float.h , and we don't need to
> exclude them. Maybe we should decide for a different default value...

You could consider making this into small step towards modularizing
mini-gmp. Something like

#ifndef MINI_GMP_FLOAT_SUPPORT
#define MINI_GMP_FLOAT_SUPPORT 1
#endif

#if MINI_GMP_FLOAT_SUPPORT
#include 
#endif

...

#if MINI_GMP_FLOAT_SUPPORT
#if FLT_RADIX == 2
#define GMP_DBL_MANT_BITS DBL_MANT_DIG
#else
/* What are our options?

   Fail as unsupported? Display warning? Or document expected breakage
   on systems for non-binary floats? Or approximate as DBL_MANT_DIG *
   log2(FLT_RADIX) ? */
#define GMP_DBL_MANT_BITS (53)
#endif

I.e., use float.h constants unconditionally whenever float support is
enabled. And gmp/bootstrap.c would then #define MINI_GMP_FLOAT_SUPPORT
0.

Independent of this suggestion, I have two minor comments on the your
change (43659273c9cf): I think the DONT_USE_FLOAT_H macro should have a
GMP or MINI_GMP prefix. And it would be nice with a test, e.g, based on
your sample program converting integers 2^i +/- 1 to double and
comparing the results (before your fix, mini-gmp rounded to nearest,
producing identical doubles for large enough i).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: _ptr and _srcptr types

2018-05-20 Thread Niels Möller

Vincent Lefevre <vinc...@vinc17.net> writes:

> No, pointers to different types may have different representations,
> and in particular, different sizes.

Hmm. I was aware that *function* pointers may use different
representation and size than data pointers.

But not that a data pointer can use different representation than void*.
If that really is the case, most occurences of NULL or (void*)0 in the
argument list in the call of a varargs function would be undefined
behavior, right?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: _ptr and _srcptr types

2018-05-16 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> I also think usage of mpz_roinit_n might lead to memory leaks, triggered
> by mpz_init (or lack of mpz_clear) before its use.  We need to point
> this out.

The name suggests that it part of the mpz_init family of functions,
which should only be used on uninitialized mpz objects. But this could be
made more explicit, and we should also mention that mpz_ronit_n
shouldn't be followed by mpz_clear.

>   I believe that both types will not change their role or definition in the
>   near future, as such they can be documented.
>   I'm not sure about: how we should explain how to use them.
>
> That is a challenge.

It shouldn't be so difficult for users familiar with the fine points of
arrays and pointers in C. mpz_t is a typedef for a size 1 array
containing the internal struct, and mpz_ptr is a pointer to the same
struct, i.e., the type that an mpz_t "decays" to in expression context.

> (Should we provide similar types for mpq and mpf?)

Yes. In m opinion, every advertised typedef of an array type should have
a corresponding advertised pointer type.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: Any interest in multi-threaded code?

2018-05-15 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> I merged the current head with that 10 years old repository, try again.

Thanks for taking care of that!

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: _ptr and _srcptr types

2018-05-15 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Emmanuel Thomé <emmanuel.th...@inria.fr> writes:
>
>   I've personally always used the mpz_ptr and mpz_srcptr in functions that
>   take mpz arguments, taking the liberty to use this undocumented type.
>
> I suppose we could declare these.  I'd like to hear from the rest of the
> GMP "core team".

My fairly strong opinion is that we should document them. They are
important for user code that needs to handle pointers to mpz_t values.
One simple use case:

  void foo (mp_srcptr a, mp_srcptr b) 
  {
if (mpz_cmp (a, b) < 0)
  {
mp_srcptr t = a;
a = b;
b = t;
  }
... code that requires a >= b ...
  }

This gets a lot more awkward if the function has to be declared using
documented types, as

  void foo (const mpz_t a, const mpz_t b)

Simplest alternative I see would be to copy into local non-const mpz_t
variables and use mpz_swap, and that's an extra copy I don't want to
force users to make.

Functions returning pointers to an mpz_t is another important case
(mpz_roinit_n is one (unusual) example, spotted by Emmanuel). mpq_numref
and mpq_denref too, if they were implemented as (inline) functions
rather than macros.

I don't think need for handling pointers is that common in GMP user
code, but not so obscure that it should be ignored and unsupported.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: _ptr and _srcptr types

2018-05-26 Thread Niels Möller

Emmanuel Thomé <emmanuel.th...@inria.fr> writes:

> diff -r aab8a010d10f -r 14b2416f45de doc/gmp.texi
> --- a/doc/gmp.texiMon May 21 00:13:27 2018 +0200
> +++ b/doc/gmp.texiThu May 24 00:31:28 2018 +0200
> @@ -1982,6 +1982,42 @@
>  Also, in general @code{mp_bitcnt_t} is used for bit counts and ranges, and
>  @code{size_t} is used for byte or character counts.
>  
> +@sp 1
> +
> +@cindex Pointer types
> +@tindex @code{mpz_ptr}
> +@tindex @code{mpz_srcptr}
> +@tindex @code{mpq_ptr}
> +@tindex @code{mpq_srcptr}
> +@tindex @code{mpf_ptr}
> +@tindex @code{mpf_srcptr}
> +@tindex @code{gmp_randstate_ptr}
> +@tindex @code{gmp_randstate_srcptr}
> +Internally, GMP data types such as @code{mpz_t} are defined as
> +one-element arrays, whose element type is part of the GMP
> +internals (@pxref{Internals}). In some cases, it may be desirable to 
> manipulate
> +pointers to this element type. Types such as @code{mpz_ptr} and
> +@code{mpz_srcptr} are defined for this purpose. 

I think I'd like to add something like

  When an array is used as a function argument in C, it is not passed by
  value, instead its value is a pointer to the first element. In C
  jargon, this is sometimes referred to as the array "decaying" to a
  pointer. For GMP types like mpz_t, that means that the function called
  gets a pointer to the caller's mpz_t value, which is why no explicit &
  operator is needed when passing output arguments. [this is related to
  the section Parameter Conventions]

  GMP defines names for these pointer types, e.g., mpz_ptr corresponding
  to mpz_t, and mpz_srcptr corresponding to const mpz_t. Most functions
  don't need to use these pointer types directly; it works fine to
  declare a function using the mpz_t or const mpz_t as the argument
  types, the same "pointer decay" happens in the background regardless.

  Occasionally, it is useful to manipulate pointers directly, e.g, to
  conditionally swap *references* to a function's inputs without
  changing the *values* as seen by the caller, or returning a pointer to
  an mpz_t which is part of a larger structure. For these cases, the
  pointer types are necessary. And a mpz_ptr can be passed as argument
  to any GMP function declared to take an mpz_t argument.

> +equivalent to the following code, which is given for illustratory
> +purposes only:
> +
> +@example
> +typedef some_internal_data_type mpz_t[1];
> +typedef some_internal_data_type * mpz_ptr;
> +typedef const some_internal_data_type * mpz_srcptr;
> +@end example

I think it's confusing with a made-up name for the internal types, but
actual names for types defined. I'd prefer to either use the actual
internal name __mpz_struct instead of some_internal_data_type above, or
change to the obviously made up name foo consistently, like

typedef foo_internal foo_t[1];
typedef foo_internal * foo_ptr;
typedef const foo_internal * foo_srcptr;

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp mpz_{get,set}_d not fully compatible with GMP

2018-05-26 Thread Niels Möller

"Marco Bodrato" <bodr...@mail.dm.unipi.it> writes:

> Il Dom, 20 Maggio 2018 9:57 pm, Niels Möller ha scritto:

>> You could consider making this into small step towards modularizing
>> mini-gmp. Something like
>
>> #ifndef MINI_GMP_FLOAT_SUPPORT
>
>> ...
>
> The idea is interesting. How should we modularize? Should we split mini-
> in many files (each #include-ing the ones it depends on) or should we have
> a single file with selector like the one you propose? The only module we
> have, currently is mini-mpq, as a file apart.

I think it should stay as one file. Not sure if it's worth the effort,
but I was thinking of trying to divide mini-gmp into modules including
things like

* Add and sub.

* Bitwise logic functions

* Mul

* Div

* Base conversion for power-of-2 bases

* Base conversion for non-power-of-2 bases, in particular, 10

* Float conversion

* Import/export

* Various higher level functions, roots, gcd, primality testing, ...

We'd need to sort out dependencies between the modules and likely
maintain that graph manually.

By default, compiling mini-gmp would include everything, but if the user
does something like

#define MINI_GMP_ENABLE_BASE_CONVERSION_POWER_OF_TWO 1
#define MINI_GMP_ENABLE_MUL 1
#include "mini.gmp.c"

only those routines and their dependencies would be included. We'd need a
pretty large block of #ifs at the top to implement the dependencies. So
the drawback is the complexity of the needed #ifdeffery, and the testing
thereof.

Uncler to me how far it's worth going in dividing code into smaller
modules. To have one or (even two, in the case of get_str) module per
function will not be easy to maintain.

Alternative might be to recommend applications that care about binary
size to compile mini-gmp.c with -ffunction-sections and let the linker
sort it out, or rely on link-time-optimization eliminating unused code.
But I doubt that LTO will detect, e.g., that mpz_get_str is only ever
called with power-of-two arguments, and eliminate the unused
mpn_get_str_other together with the division code it depends on.

> I'm already tempted to join at least mini-gmp.h and mini-mpq.h

Makes sense to me. Could consider renaming mini-gmp.c to mini-mpz.c, but
doesn't matter so much.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: mini-gmp mpz_{get,set}_d not fully compatible with GMP

2018-05-20 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> Would it be beneficial to add some non-IEEE host to the GMP autobuild
> setup?  We could revive vax using simh.

I think so. Does current gcc support VAX, or what compiler would we use?

Can VAX be supported using the generic code, without HAVE_DOUBLE_VAX_D?
Testing the generic code seems more important than testing VAX-specific
optimizations.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: udiv_qr_3by2 vs divappr

2018-05-02 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> ni...@lysator.liu.se (Niels Möller) writes:
>
>   Yes. By one if divappr_2 can produce a useful single-limb
>   remainder, otherwise by two.
>
> I am very curious about the result of this work.  A larger submul_1
> tripcount is not for free, but the current 3/2 quotient approximation is
> also not for free.

The main case of the current schoolbook iteration looks like

  udiv_qr_3by2 (q, n1, n0, n1, np[1], np[0], d1, d0, dinv);

  cy = mpn_submul_1 (np - dn, dp, dn, q);

  cy1 = n0 < cy;
  n0 = (n0 - cy) & GMP_NUMB_MASK;
  cy = n1 < cy1;
  n1 = (n1 - cy1) & GMP_NUMB_MASK;
  np[0] = n0;

  if (UNLIKELY (cy != 0))
...

So I would hope that increased submul_1 tripcount would be reasonably
cheap compared to these n0 and n1 updates, even if submul_1 might redo
one or two multiplications already done inside the quotient
approximation.

> I suspect that significant speedup would come from early, speculative
> quotient approximation.  We do its equivalent for some asm redc_1 and
> its sibling mpn_sbpi1_bdiv_r (see e.g. mpn/x86_64/zen/sbpi1_bdiv_r.asm).
> This was a significant speedup.

I'll try to keep that in mind, but it's going to be a bit tricky.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

Re: udiv_qr_3by2 vs divappr

2018-05-01 Thread Niels Möller

t...@gmplib.org (Torbjörn Granlund) writes:

> So the submul_1 size agument will increase slightly?

Yes. By one if divappr_2 can produce a useful single-limb
remainder, otherwise by two.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel

< 1 2 3 4 5 6 7 8 >

301 - 400 of 723 matches

Mail list logo