On Thu, 2022-01-20 at 14:41 +0100, Francesc Alted wrote:
> On Wed, Jan 19, 2022 at 7:48 PM Francesc Alted <fal...@gmail.com> wrote:
>
> > On Wed, Jan 19, 2022 at 6:58 PM Stanley Seibert <sseib...@anaconda.com>
> > wrote:
> >
> > > Given that this seems to be Linux only, is this related to how glibc
> > > does large allocations (>128kB) using mmap()?
> > >
> > > https://stackoverflow.com/a/33131385
> >
> > That's a good point. As MMAP_THRESHOLD is 128 KB, and the size of `z` is
> > almost 4 MB, mmap machinery is probably getting involved here. Also, as
> > pages acquired via anonymous mmap are not actually allocated until you
> > access them the first time, that would explain why the first access is
> > slow. What puzzles me is that the timeit loops access `z` data 3*10000
> > times, which is plenty of time for doing the allocation (it should
> > require just a single iteration).
>
> I think I have more evidence that what is happening here has to do with
> how the malloc mechanism works in Linux. I find the following explanation
> to be really good:
>
> https://sourceware.org/glibc/wiki/MallocInternals
>
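For reference, the first-touch behaviour described in the quoted text can be observed from plain Python. This is only a minimal sketch, assuming a Unix system where the `resource` module reports minor page faults. The helper names `minor_faults` and `touch` are made up for illustration. It maps memory directly with `mmap` instead of going through `malloc`, so it illustrates the mechanism and is not a reproduction of what NumPy does.

```python
import mmap
import resource

PAGE = mmap.PAGESIZE
NBYTES = 250_000 * 16        # same size as `z`: 250000 complex128 values
MMAP_THRESHOLD = 128 * 1024  # glibc default mentioned in the thread


def minor_faults():
    # Process-wide counter, so it also includes faults caused by the
    # interpreter itself; treat the numbers as approximate.
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch(buf):
    # Write one byte per page so that every page gets accessed.
    for offset in range(0, len(buf), PAGE):
        buf[offset] = 1


# Anonymous private mapping, similar in kind to what malloc requests
# from the kernel for allocations above the mmap threshold.
buf = mmap.mmap(-1, NBYTES, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

before = minor_faults()
touch(buf)                   # first access: pages are faulted in
first = minor_faults() - before

before = minor_faults()
touch(buf)                   # second access: pages are already present
second = minor_faults() - before

print(f"buffer size           : {NBYTES} bytes ({NBYTES // PAGE} full pages)")
print(f"above mmap threshold  : {NBYTES > MMAP_THRESHOLD}")
print(f"minor faults, 1st pass: {first}")
print(f"minor faults, 2nd pass: {second}")

buf.close()
```

On a typical Linux box one would expect the first pass to fault roughly once per page and the second pass to fault hardly at all, but transparent huge pages or a sandboxed kernel can change the counts.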
Thanks for figuring this out!  It has been bugging me a lot before.  So
it depends on how `malloc` works rather than on the kernel.  It is
surprising how "random" this can look, but I suppose some examples just
happen to sit almost exactly at the threshold.

It might be interesting if we could tweak `mallopt` parameters for
typical NumPy usage.  But unless the benefit is very clear, maybe a
standalone module is better?

> In addition, this excerpt of the mallopt manpage (
> https://man7.org/linux/man-pages/man3/mallopt.3.html) is very
> significant:

<snip>

> All in all, this is testimony to how much memory handling can affect
> performance in modern computers.  Perhaps it is time for testing
> different memory allocation strategies in NumPy and coming up with
> suggestions for users.

You are probably aware, but Matti and Elias have now added the ability
to customize array data allocation in NumPy, so it should be
straightforward to write a small package/module that tweaks the
allocation strategy here.

Cheers,

Sebastian


> Francesc
>
> > > On Wed, Jan 19, 2022 at 9:06 AM Sebastian Berg
> > > <sebast...@sipsolutions.net> wrote:
> > >
> > > > On Wed, 2022-01-19 at 11:49 +0100, Francesc Alted wrote:
> > > > > On Wed, Jan 19, 2022 at 7:33 AM Stefan van der Walt
> > > > > <stef...@berkeley.edu> wrote:
> > > > >
> > > > > > On Tue, Jan 18, 2022, at 21:55, Warren Weckesser wrote:
> > > > > > > expr = 'z.real**2 + z.imag**2'
> > > > > > >
> > > > > > > z = generate_sample(n, rng)
> > > > > >
> > > > > > 🤔 If I duplicate the `z = ...` line, I get the fast result
> > > > > > throughout.  If, however, I use `generate_sample(1, rng)` (or
> > > > > > any other value than `n`), it does not improve matters.
> > > > > >
> > > > > > Could this be a memory caching issue?
> > > >
> > > > Yes, it is a caching issue for sure.  We have seen similar random
> > > > fluctuations before.
> > > > You can prove that it is a cache page-fault issue by running it
> > > > with `perf stat`.  I did this twice, once with the second loop
> > > > removed (page-faults only):
> > > >
> > > >     28333629      page-faults      #  936.234 K/sec
> > > >     28362718      page-faults      #    1.147 M/sec
> > > >
> > > > The number of page faults is high.  Running only the second one
> > > > (or running the first one only once, rather), gave me:
> > > >
> > > >     15024      page-faults      #    1.837 K/sec
> > > >
> > > > So that is the *reason*.  I had tried before to figure out why the
> > > > page faults differ so much, or if we can do something about it.
> > > > But I never had any reasonable lead on it.
> > > >
> > > > In general, these fluctuations are pretty random, in the sense
> > > > that unrelated code changes and recompilation can swap the
> > > > behaviour easily, as Andras noted: he sees the opposite.
> > > >
> > > > I would love to have an idea if there is a way to figure out why
> > > > the page-faults are so imbalanced between the two.
> > > >
> > > > (I have not looked at CPU cache misses this time, but since
> > > > page-faults happen, I assume that should not matter?)
> > > >
> > > > Cheers,
> > > >
> > > > Sebastian
> > > >
> > > > > I can also reproduce that, but only on my Linux boxes.  My
> > > > > MacMini does not notice the difference.
> > > > >
> > > > > Interestingly enough, you don't even need an additional call to
> > > > > `generate_sample(n, rng)`.
> > > > > If one uses `z = np.empty(...)` and then does an assignment,
> > > > > like:
> > > > >
> > > > >     z = np.empty(n, dtype=np.complex128)
> > > > >     z[:] = generate_sample(n, rng)
> > > > >
> > > > > then everything runs at the same speed:
> > > > >
> > > > > numpy version 1.20.3
> > > > >
> > > > >  142.3667 microseconds
> > > > >  142.3717 microseconds
> > > > >  142.3781 microseconds
> > > > >
> > > > >  142.7593 microseconds
> > > > >  142.3579 microseconds
> > > > >  142.3231 microseconds
> > > > >
> > > > > As another data point, by doing the same operation but using
> > > > > numexpr I am not seeing any difference either, not even on
> > > > > Linux:
> > > > >
> > > > > numpy version 1.20.3
> > > > > numexpr version 2.8.1
> > > > >
> > > > >   95.6513 microseconds
> > > > >   88.1804 microseconds
> > > > >   97.1322 microseconds
> > > > >
> > > > >  105.0833 microseconds
> > > > >  100.5555 microseconds
> > > > >  100.5654 microseconds
> > > > >
> > > > > [it is rather a bit the other way around, the second iteration
> > > > > seems a hair faster]
> > > > > See the numexpr script below.
> > > > >
> > > > > I am totally puzzled here.
> > > > >
> > > > > """
> > > > > import timeit
> > > > > import numpy as np
> > > > > import numexpr as ne
> > > > >
> > > > >
> > > > > def generate_sample(n, rng):
> > > > >     return rng.normal(scale=1000, size=2*n).view(np.complex128)
> > > > >
> > > > >
> > > > > print(f'numpy version {np.__version__}')
> > > > > print(f'numexpr version {ne.__version__}')
> > > > > print()
> > > > >
> > > > > rng = np.random.default_rng()
> > > > > n = 250000
> > > > > timeit_reps = 10000
> > > > >
> > > > > expr = 'ne.evaluate("zreal**2 + zimag**2")'
> > > > >
> > > > > z = generate_sample(n, rng)
> > > > > zreal = z.real
> > > > > zimag = z.imag
> > > > > for _ in range(3):
> > > > >     t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
> > > > >     print(f"{1e6*t/timeit_reps:9.4f} microseconds")
> > > > > print()
> > > > >
> > > > > z = generate_sample(n, rng)
> > > > > zreal = z.real
> > > > > zimag = z.imag
> > > > > for _ in range(3):
> > > > >     t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
> > > > >     print(f"{1e6*t/timeit_reps:9.4f} microseconds")
> > > > > print()
> > > > > """
> > > > >
> > > > > _______________________________________________
> > > > > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > > > > To unsubscribe send an email to numpy-discussion-le...@python.org
> > > > > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > > > > Member address: sebast...@sipsolutions.net
> >
> > --
> > Francesc Alted
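
To make the `mallopt` idea from the reply above a bit more concrete, here is a rough sketch of what a standalone tweak might look like. It assumes glibc on Linux and calls `mallopt` through `ctypes`. The helper name `set_malloc_thresholds` and the 8 MiB value are made up for illustration; the `M_*` numbers are taken from glibc's `<malloc.h>`. Whether this changes the timings in this thread at all would still need to be measured.

```python
import ctypes
import sys

# Parameter numbers as defined in glibc's <malloc.h>.
M_TRIM_THRESHOLD = -1
M_MMAP_THRESHOLD = -3


def set_malloc_thresholds(nbytes):
    """Ask glibc to serve allocations below `nbytes` from the heap.

    Returns True only if both mallopt() calls report success.
    Hypothetical helper, not part of NumPy or any existing package.
    """
    if not sys.platform.startswith("linux"):
        return False
    try:
        # dlopen(NULL): symbols of the running process, including libc.
        libc = ctypes.CDLL(None)
    except OSError:
        return False
    # Not every libc provides mallopt, so look it up defensively.
    mallopt = getattr(libc, "mallopt", None)
    if mallopt is None:
        return False
    mallopt.argtypes = [ctypes.c_int, ctypes.c_int]
    mallopt.restype = ctypes.c_int

    # mallopt() returns 1 on success and 0 on error.
    ok_mmap = mallopt(M_MMAP_THRESHOLD, nbytes)
    # Keep freed memory at the top of the heap instead of returning it
    # to the kernel right away.
    ok_trim = mallopt(M_TRIM_THRESHOLD, nbytes)
    return ok_mmap == 1 and ok_trim == 1


# Illustrative value only: large enough to cover the ~4 MB arrays above.
changed = set_malloc_thresholds(8 * 1024 * 1024)
print(f"malloc thresholds changed: {changed}")
```

According to the mallopt manpage, setting either of these parameters explicitly also disables glibc's dynamic adjustment of the mmap threshold, so a call like this should happen early, before the arrays of interest are allocated.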