[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-12 Thread STINNER Victor
STINNER Victor added the comment: The optimization of 2**n appears to be useful only for very large values of n, where the result is larger than 1 MB. This use case is very rare, and you should probably use another library (GMP, numpy, or something else) for such large numbers. I'm closing the issue.

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread STINNER Victor
New submission from STINNER Victor: Attached patch modifies long_lshift() to allocate the result using calloc() instead of malloc(). The goal is to avoid allocating physical memory for the least significant digits (zeros). According to my tests in issue #21233 (calloc), Linux and Windows have
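To make the "least significant digits (zeros)" point concrete, here is a minimal Python sketch. It is an illustration only, not part of the patch: the helper names low_zero_digits and total_digits are hypothetical, and the digit width is read from sys.int_info rather than assumed. It shows that for 1 << shift, all but the top internal digit of the result are zero, which is the memory a zero-filled allocation would not have to touch.

```python
import sys

# Width of one internal digit of a Python int on this build
# (reported by the interpreter; typically 30 or 15 bits).
BITS_PER_DIGIT = sys.int_info.bits_per_digit


def low_zero_digits(shift: int) -> int:
    """Whole low-order digits that are guaranteed zero in (x << shift)."""
    return shift // BITS_PER_DIGIT


def total_digits(value: int) -> int:
    """Number of internal digits needed to hold abs(value)."""
    bits = abs(value).bit_length()
    return max(1, -(-bits // BITS_PER_DIGIT))  # ceiling division


shift = 1_000_000
result = 1 << shift
zeros = low_zero_digits(shift)

# Mask covering exactly the digits claimed to be zero.
low_mask = (1 << (zeros * BITS_PER_DIGIT)) - 1

print(f"digits in result: {total_digits(result)}")
print(f"low digits that are zero: {zeros}")
print(f"low digits really zero: {result & low_mask == 0}")
```

For a shift of a small value, only the few topmost digits carry data, so the larger the shift, the larger the share of the buffer that is pure zeros.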

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread STINNER Victor
STINNER Victor added the comment: bench_long_rshift.py: Microbenchmark for int << int (lshift) operation. Results on Linux: Common platform: Bits: int=32, long=64, long long=64, size_t=64, void*=64 Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)',
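The attached bench_long_rshift.py is not reproduced in this listing. As a rough idea of what a microbenchmark for the shift operation can look like, here is a sketch using only the standard timeit module. The function name bench_lshift, the shift sizes, and the repeat counts are all assumptions chosen for illustration, not the contents of the attached script.

```python
import timeit


def bench_lshift(shift_bits: int, repeat: int = 5, number: int = 100) -> float:
    """Best observed time in seconds for a single `1 << shift_bits`."""
    # Operands are passed as variables so the compiler cannot
    # constant-fold the expression away.
    timer = timeit.Timer("one << n", globals={"one": 1, "n": shift_bits})
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Hypothetical sizes: small, medium, and large results.
timings = {n: bench_lshift(n) for n in (10, 1_000, 100_000)}

for n, seconds in timings.items():
    print(f"1 << {n}: {seconds * 1e9:.1f} ns per call")
```

Taking the minimum over several repeats reduces noise from other processes; comparing the numbers before and after a patch is what shows whether small shifts got slower.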

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread Josh Rosenberg
Josh Rosenberg added the comment: Looks like you forgot to actually use the use_calloc parameter you put in the long_alloc prototype. It accepts it, but it's never used or passed to another function. So right now, the only difference in behavior is that there is an extra layer of function

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread Josh Rosenberg
Josh Rosenberg added the comment: Given your benchmarks show improvements (you posted while I was typing my last comment), I'm guessing it's just the posted patch that's wrong, and your local changes actually use use_calloc?

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread STINNER Victor
STINNER Victor added the comment: Without the patch, 1 << (2**29) allocates 69.9 MB. With the patch, 1 << (2**29) allocates 0.1 MB (104 KB). Without the patch, $ ./python Python 3.5.0a0 (default:5b0fda8f5718, May 2 2014, 22:47:06) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux import os

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread STINNER Victor
STINNER Victor added the comment: Looks like you forgot to actually use the use_calloc parameter you put in the long_alloc prototype. It accepts it, but it's never used or passed to another function. Oh f###, you're right. See new patch. I ran again the new benchmark: my (updated) patch

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread STINNER Victor
STINNER Victor added the comment: Without the patch, 1 << (2**29) allocates 69.9 MB. With the patch, 1 << (2**29) allocates 0.1 MB (104 KB). This is still true with long_lshift2.patch (except that in my new test, it allocates 196 kB).

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread Josh Rosenberg
Josh Rosenberg added the comment: While you're doing this, might it make sense to add a special case to long_pow so it identifies cases where a (digit-sized) value with an absolute value equal to a power of 2 is being raised to a positive integer exponent, and convert said cases to equivalent
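The special case suggested above can be sketched in Python to show the arithmetic involved. This is only an illustration of the idea, not code from long_pow: the function name pow_via_shift is hypothetical, and the sketch ignores the "digit-sized" restriction mentioned in the comment. When abs(base) equals 2**k, base**exponent has magnitude 1 << (k * exponent), and the result is negative only when base is negative and exponent is odd.

```python
def pow_via_shift(base: int, exponent: int) -> int:
    """Compute base ** exponent, using a shift when abs(base) is a power of 2."""
    magnitude = abs(base)
    is_power_of_two = magnitude != 0 and magnitude & (magnitude - 1) == 0

    if exponent > 0 and is_power_of_two:
        k = magnitude.bit_length() - 1      # magnitude == 2 ** k
        result = 1 << (k * exponent)        # (2 ** k) ** exponent
        if base < 0 and exponent % 2 == 1:  # odd power of a negative base
            result = -result
        return result

    # Anything else goes through the ordinary power operator.
    return base ** exponent


print(pow_via_shift(2, 10) == 2 ** 10)
print(pow_via_shift(-8, 3) == (-8) ** 3)
print(pow_via_shift(3, 4) == 3 ** 4)
```

Routing such cases through the shift path would let them benefit from any allocation change made there, at the cost of an extra check on every power operation.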

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread Josh Rosenberg
Josh Rosenberg added the comment: I swear, I need to refresh before I post a long comment. If this is slowing everything down a little just to make 1 << (2 ** 29) faster (and did you really mean 1 << (1 << 29)? :-) ), then I'd say drop it.

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread Josh Rosenberg
Josh Rosenberg added the comment: One possible way to salvage it: Have you considered declaring long_alloc as Py_LOCAL_INLINE, or, now that I've checked #5553, a macro for long_alloc, so it gets inlined, and doesn't add the check overhead to everything (since properly inlining with a constant

[issue21419] Use calloc() instead of malloc() for int << int (lshift)

2014-05-02 Thread Josh Rosenberg
Josh Rosenberg added the comment: And now that I'm thinking about it, the probable cause of the slowdown is that, for any PyLongObject that's smaller than PAGESIZE (give or take), using Calloc means calloc is just calling memset to zero the PyLongObject header and the used part of the high