Re: Stack allocation
ni...@lysator.liu.se (Niels Möller) writes:

> t...@gmplib.org (Torbjörn Granlund) writes:
>
> > I decided to lower the TMP_SALLOC limit to a bit under 2^15 from the
> > previous 2^16.
>
> What's the relative cost of allocation vs simple operations like
> mpn_add_n? For 2^15 limit, that's 512 limbs (on 64-bit). I guess
> overhead of a malloc call might be comparable to an mpn_add_n with
> n = 512, but it ought to be a lot faster than, e.g., an n = 256
> mpn_mul_n.

It might be the case that malloc's performance varies a lot between implementations. I wouldn't be surprised if BSD and GNU malloc are several times faster than malloc from the various non-free Unices.

I don't expect the free mallocs to need even near time(mpn_add_n(512)). But perhaps they need 10% of that, which is still too much for GMP. Fortunately, I don't think we make dynamic allocations for O(n) operations.

> Would it make sense to lower the limit further to, say, 128 limbs?

Who knows. I played with that, but it does not decrease stack usage as much as one might expect (only 20% as measured by the test suite). Lowering the limit adds gradually more overhead but gives rapidly diminishing returns in stack use.

> Nice! That seems very reasonable on current desktop and server
> machines, but it might still be a bit large if people use gmp on
> embedded systems.

Perhaps alloca is not useful there?

There is currently one oddity in that the limit is more limbs on 32-bit machines than on 64-bit machines.

Torbjörn
Please encrypt, key id 0xC8601622
___
gmp-devel mailing list
gmp-devel@gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel
Re: Stack allocation
t...@gmplib.org (Torbjörn Granlund) writes:

> ni...@lysator.liu.se (Niels Möller) writes:
>
> > Would it make sense to lower the limit further to, say, 128 limbs?
>
> Who knows. I played with that, but it does not decrease stack usage as
> much as one might expect (only 20% as measured by the test suite).

I see.

> > Nice! That seems very reasonable on current desktop and server
> > machines, but it might still be a bit large if people use gmp on
> > embedded systems.
>
> Perhaps alloca is not useful there?

What stack usage do you get if you disable use of stack allocation?

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
Re: Stack allocation
ni...@lysator.liu.se (Niels Möller) writes:

> What stack usage do you get if you disable use of stack allocation?

A good question. My measurements are blunt, using 'ulimit -s'. I don't know how to measure it accurately without instrumenting the code.

I assume that GMP will use around 1 KiB, since its recursion isn't very deep.

Torbjörn
Re: Stack allocation
I made the automated GMP nightly builds use at most 512 KiB of stack.

Now I realise that what the test suite needs might both overestimate and underestimate the actual requirements. The overestimate will come from tests/mpn, where we call functions outside their normal operand size envelope. Underestimation might happen because we don't use large enough operands.

I tried lowering the TMP_SALLOC limit from 2^16 to 2^15 and 2^14, and checked the resulting stack usage.

For the current limit of 2^16, the use is about 512 KiB, depending a little on the various THRESHOLDs.

For 2^15, the maximum use dropped to about 256 KiB, i.e., linear as expected.

For 2^14, the maximum use didn't drop much at all, since here some direct TMP_SALLOC allocations hurt.

Torbjörn
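For readers unfamiliar with the mechanism being tuned here: a size cutoff decides whether scratch space is taken from the stack (alloca) or from the heap. The following is a minimal sketch of that idea only, not GMP's actual code. All sketch_/SKETCH_ names are hypothetical, the cutoff 0x7f00 is merely an illustrative value "a bit under 2^15" bytes, and <alloca.h> is assumed to be available (as it is with glibc).

```c
#include <alloca.h>   /* assumption: glibc-style header for alloca */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cutoff in bytes; requests above it go to the heap. */
#define SKETCH_SALLOC_LIMIT 0x7f00

/* Heap blocks are chained so one call can release them all. */
union sketch_block {
    union sketch_block *next;
    max_align_t align;          /* keeps the payload suitably aligned */
};

static void *sketch_balloc(union sketch_block **head, size_t n)
{
    union sketch_block *b = malloc(sizeof *b + n);
    if (b == NULL)
        abort();
    b->next = *head;
    *head = b;
    return b + 1;               /* payload starts right after the header */
}

static void sketch_free_all(union sketch_block *head)
{
    while (head != NULL) {
        union sketch_block *next = head->next;
        free(head);
        head = next;
    }
}

#define SKETCH_DECL     union sketch_block *sketch_heap = NULL
#define SKETCH_ALLOC(n) ((n) <= SKETCH_SALLOC_LIMIT            \
                             ? alloca(n)                       \
                             : sketch_balloc(&sketch_heap, (n)))
#define SKETCH_FREE     sketch_free_all(sketch_heap)

/* Example caller: fills n_limbs words of scratch space with 1 and
   returns their sum, so the result equals n_limbs. */
static unsigned long sketch_use(size_t n_limbs)
{
    SKETCH_DECL;
    unsigned long *tmp = SKETCH_ALLOC(n_limbs * sizeof(unsigned long));
    unsigned long sum = 0;
    for (size_t i = 0; i < n_limbs; i++)
        tmp[i] = 1;
    for (size_t i = 0; i < n_limbs; i++)
        sum += tmp[i];
    SKETCH_FREE;                /* stack space is reclaimed on return */
    return sum;
}
```

With this layout, a call such as sketch_use(16) stays on the stack while sketch_use(100000) goes through malloc. Lowering the cutoff moves more requests to the heap, which is the overhead versus stack-use trade-off discussed in this thread.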
Re: Stack allocation
I decided to lower the TMP_SALLOC limit to a bit under 2^15 from the previous 2^16.

With that change and a couple of other allocation changes, GMP's now using less than 300 KiB of stack. The nightly builds attempt to enforce this limit.

Torbjörn
Stack allocation
This started as a thread in gmp-discuss about crashes due to stack overflow.

I modified the TMP_SALLOC macro in gmp-impl.h to print its allocation argument. I did this as I suspected that we sometimes invoke the SALLOC form inappropriately for huge allocations.

Below is a sample output. We clearly have some bad allocation code, since TMP_SALLOC should only be used for small allocations.

ALLOC:721952
ALLOC:696992
PASS: t-mul
--
ALLOC:664480
ALLOC:664352
ALLOC:688288
ALLOC:688288
ALLOC:619296
ALLOC:619296
PASS: t-tdiv
--
ALLOC:642208
ALLOC:643744
ALLOC:642208
ALLOC:643744
ALLOC:642208
PASS: t-gcd
--
ALLOC:667424
ALLOC:667424
ALLOC:667424
ALLOC:661664
ALLOC:661664
ALLOC:661664
ALLOC:661664
ALLOC:661664
ALLOC:661664
ALLOC:667424
PASS: reuse
--
ALLOC:672544
ALLOC:652448
PASS: t-remove

Torbjörn
Re: Stack allocation
t...@gmplib.org (Torbjörn Granlund) writes:

> I modified the TMP_SALLOC macro in gmp-impl.h to print its allocation
> argument. I did this as I suspected that we sometimes invoke the
> SALLOC form inappropriately for huge allocation.

After adding printing of __FILE__ and __LINE__ to the diagnostics code, I identified two bad TMP_SALLOC_LIMBS invocations in mpn/generic/mul.c. These are now patched.

The code can still be improved in many ways, including trimming of allocation.

Torbjörn
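The kind of diagnostic described in this message can be sketched as follows. This is not the patch that was applied to gmp-impl.h. The names sketch_trace_alloc, SKETCH_SALLOC and sketch_demo are hypothetical, the output format after the "ALLOC:" prefix is an assumption, and <alloca.h> is assumed to be available (as it is with glibc).

```c
#include <alloca.h>   /* assumption: glibc-style header for alloca */
#include <stddef.h>
#include <stdio.h>

/* Report one stack allocation on stderr, then pass the size through
   unchanged so the call can sit inside the allocation expression. */
static size_t sketch_trace_alloc(const char *file, int line, size_t n)
{
    fprintf(stderr, "ALLOC:%zu %s:%d\n", n, file, line);
    return n;
}

/* Hypothetical traced variant of a stack-allocation macro. __FILE__ and
   __LINE__ expand at the call site, which is what pins down the caller. */
#define SKETCH_SALLOC(n) \
    alloca(sketch_trace_alloc(__FILE__, __LINE__, (n)))

/* Example caller: one traced allocation of four words. */
static unsigned long sketch_demo(void)
{
    unsigned long *tmp = SKETCH_SALLOC(4 * sizeof(unsigned long));
    tmp[0] = 7;
    return tmp[0];
}
```

Because the macro expands where it is used, each line of output names the source location of the oversized request, so filtering a test-suite log for large sizes is enough to find the offending call sites.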