On Tue, Jul 9, 2019 at 9:46 AM Tim Peters wrote:
>
> > In the end, all size classes have 1~3 used/cached memory blocks.
>
> No doubt part of it, but hard to believe it's most of it. If the loop
> count above really is 10240, then there's only about 80K worth of
> pointers in the final `buf`.
You
[Inada Naoki , trying mimalloc]
>>> Hmm, it is not good. mimalloc uses MADV_FREE, so it may affect some
>>> benchmarks. I will look into it later.
>> ...
>> $ ./python -m pyperf compare_to pymalloc-mem.json mimalloc-mem.json -G
>> Slower (60):
>> - logging_format: 10.6 MB +- 384.2 kB -> 27.2 MB
On Thu, Jul 4, 2019 at 11:32 PM Inada Naoki wrote:
>
> On Thu, Jul 4, 2019 at 8:09 PM Antoine Pitrou wrote:
> >
> > Ah, interesting. Were you able to measure the memory footprint as well?
> >
>
> Hmm, it is not good. mimalloc uses MADV_FREE, so it may affect some
> benchmarks. I will look into it later.
[Victor Stinner ]
> I guess that INADA-san used pyperformance --track-memory.
>
> pyperf --track-memory doc:
> "--track-memory: get the memory peak usage. it is less accurate than
> tracemalloc, but has a lower overhead. On Linux, compute the sum of
> Private_Clean and Private_Dirty memory
I found that the calibrated loop count is not stable, so memory usage differs
a lot in some benchmarks.
In particular, the RAM usage of the logging benchmark is closely tied to the
loop count:
$ PYTHONMALLOC=malloc LD_PRELOAD=$HOME/local/lib/libmimalloc.so
./python bm_logging.py simple --track-memory --fast
I guess that INADA-san used pyperformance --track-memory.
pyperf --track-memory doc:
"--track-memory: get the memory peak usage. it is less accurate than
tracemalloc, but has a lower overhead. On Linux, compute the sum of
Private_Clean and Private_Dirty memory mappings of /proc/self/smaps.
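That smaps accounting is easy to reproduce. A minimal sketch of the sum pyperf computes on Linux (the function name and the sample usage are mine, not pyperf's):

```python
def private_memory_kb(smaps_text: str) -> int:
    """Sum Private_Clean + Private_Dirty (in kB) over all mappings,
    mirroring what pyperf --track-memory reports on Linux."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith(("Private_Clean:", "Private_Dirty:")):
            # Lines look like: "Private_Dirty:       128 kB"
            total += int(line.split()[1])
    return total

# In a live process one would read the real file:
#   with open("/proc/self/smaps") as f:
#       kb = private_memory_kb(f.read())
```

This is why MADV_FREE matters for the numbers: pages an allocator has marked MADV_FREE may still be counted as private until the kernel actually reclaims them.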
On
[Antoine Pitrou ]
>> Ah, interesting. Were you able to measure the memory footprint as well?
[Inada Naoki ]
> Hmm, it is not good. mimalloc uses MADV_FREE, so it may affect some
> benchmarks. I will look into it later.
>
> ```
> $ ./python -m pyperf compare_to pymalloc-mem.json mimalloc-mem.json
On Thu, 4 Jul 2019 23:32:55 +0900
Inada Naoki wrote:
> On Thu, Jul 4, 2019 at 8:09 PM Antoine Pitrou wrote:
> >
> > Ah, interesting. Were you able to measure the memory footprint as well?
> >
>
> Hmm, it is not good. mimalloc uses MADV_FREE, so it may affect some
> benchmarks. I will look into it later.
On Thu, Jul 4, 2019 at 8:09 PM Antoine Pitrou wrote:
>
> Ah, interesting. Were you able to measure the memory footprint as well?
>
Hmm, it is not good. mimalloc uses MADV_FREE, so it may affect some
benchmarks. I will look into it later.
```
$ ./python -m pyperf compare_to pymalloc-mem.json
On Thu, 4 Jul 2019 16:19:52 +0900
Inada Naoki wrote:
> On Tue, Jun 25, 2019 at 5:49 AM Antoine Pitrou wrote:
> >
> >
> > For the record, there's another contender in the allocator
> > competition now:
> > https://github.com/microsoft/mimalloc/
> >
> > Regards
> >
> > Antoine.
>
> It's a very strong competitor!
On Tue, Jun 25, 2019 at 5:49 AM Antoine Pitrou wrote:
>
>
> For the record, there's another contender in the allocator
> competition now:
> https://github.com/microsoft/mimalloc/
>
> Regards
>
> Antoine.
It's a very strong competitor!
$ ./python -m pyperf compare_to pymalloc.json mimalloc.json
[Antoine Pitrou ]
> For the record, there's another contender in the allocator
> competition now:
> https://github.com/microsoft/mimalloc/
Thanks! From a quick skim, most of it is addressing things obmalloc doesn't:
1) Efficient thread safety (we rely on the GIL).
2) Directly handling requests
For the record, there's another contender in the allocator
competition now:
https://github.com/microsoft/mimalloc/
Regards
Antoine.
On Mon, 24 Jun 2019 00:20:03 -0500
Tim Peters wrote:
> [Tim]
> > The radix tree generally appears to be a little more memory-frugal
> > than my PR (presumably
[Tim]
> The radix tree generally appears to be a little more memory-frugal
> than my PR (presumably because my need to break "big pools" into 4K
> chunks, while the tree branch doesn't, buys the tree more space to
> actually store objects than it costs for the new tree).
It depends a whole lot on
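For context, the radix-tree branch answers "did this address come from one of our arenas?" with a tree lookup keyed on the address's high bits, instead of reading a nearby pool header. A toy sketch (constants, class, and method names are mine; the real branch differs in detail):

```python
# Toy model: map an address's high bits to the arena (if any) it came
# from, replacing address_in_range().  Assumes 48-bit user-space
# addresses and 1 MiB arenas.
ARENA_BITS = 20                      # 1 MiB arenas
L1_BITS = 14                         # top-level index width
ADDR_BITS = 48
L2_BITS = ADDR_BITS - ARENA_BITS - L1_BITS

class ArenaRadixTree:
    def __init__(self):
        self.root = {}

    def _indices(self, addr):
        i1 = addr >> (ARENA_BITS + L2_BITS)
        i2 = (addr >> ARENA_BITS) & ((1 << L2_BITS) - 1)
        return i1, i2

    def mark_arena(self, base, arena):
        # Record that the 1 MiB region starting at base belongs to us.
        i1, i2 = self._indices(base)
        self.root.setdefault(i1, {})[i2] = arena

    def lookup(self, addr):
        # O(1), and it never reads memory obmalloc doesn't own (unlike
        # the address_in_range() trick, which may touch a foreign pool
        # header to inspect it).
        i1, i2 = self._indices(addr)
        return self.root.get(i1, {}).get(i2)
```

The tree costs some memory of its own, which is the trade-off being measured against the big-pools PR above.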
[Thomas]
>>> And what would be an efficient way of detecting allocations punted to
>>> malloc, if not address_in_range?
[Tim]
>> _The_ most efficient way is the one almost all allocators used long
>> ago: use some "hidden" bits right before the address returned to the
>> user to store info about
On Fri, 21 Jun 2019 17:18:18 -0500
Tim Peters wrote:
>
> > And what would be an efficient way of detecting allocations punted to
> > malloc, if not address_in_range?
>
> _The_ most efficient way is the one almost all allocators used long
> ago: use some "hidden" bits right before the address
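The "hidden bits" scheme Tim describes can be modeled in a few lines. A toy simulation with integer pseudo-addresses (class and attribute names are mine):

```python
import struct

class HeaderAllocator:
    """Toy model: stash the block size in a hidden header just before
    the address handed to the caller, so free() can recover it with
    plain pointer arithmetic -- no address_in_range()-style lookup."""
    HEADER = 8                      # room for one little-endian uint64

    def __init__(self):
        self.blocks = {}            # base pseudo-address -> header bytes
        self.next_addr = 0x1000

    def malloc(self, size):
        base = self.next_addr
        self.next_addr += self.HEADER + size
        self.blocks[base] = struct.pack("<Q", size)   # hidden header
        return base + self.HEADER                     # user-visible addr

    def free(self, addr):
        base = addr - self.HEADER                     # step back to header
        (size,) = struct.unpack("<Q", self.blocks.pop(base))
        return size              # allocator learns the size for free
```

In C the same trick is just a read of `((size_t *)p)[-1]`; the price is the per-allocation header overhead.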
For those who would like to test with something compatible with
Python 3.7.3, I made re-based branches here:
https://github.com/nascheme/cpython/tree/obmalloc_radix_v37
https://github.com/nascheme/cpython/tree/obmalloc_big_pools_v37
They should be ABI compatible with Python 3.7.3. So,
On 2019-06-21, Tim Peters wrote:
> [Thomas Wouters ]
> > Getting rid of address_in_range sounds like a nice idea, and I
> > would love to test how feasible it is -- I can run such a change
> > against a wide selection of code at work, including a lot of
> > third-party extension modules, but I
On Fri, Jun 21, 2019 at 11:19 PM Thomas Wouters wrote:
> Is this really feasible in a world where the allocators can be selected (and
> the default changed) at runtime?
The memory allocator must not be changed after Python
pre-initialization. What's done after pre-initialization is more to
[Tim]
>> I don't think we need to cater anymore to careless code that mixes
>> system memory calls with O calls (e.g., if an extension gets memory
>> via `malloc()`, it's its responsibility to call `free()`), and if not
>> then `address_in_range()` isn't really necessary anymore either, and
>>
On Sun, Jun 2, 2019 at 7:57 AM Tim Peters wrote:
> I don't think we need to cater anymore to careless code that mixes
> system memory calls with O calls (e.g., if an extension gets memory
> via `malloc()`, it's its responsibility to call `free()`), and if not
> then `address_in_range()` isn't
[Neil Schemenauer ]
> I've done a little testing the pool overhead. I have an application
> that uses many small dicts as holders of data. The function:
>
> sys._debugmallocstats()
>
> is useful to get stats for the obmalloc pools. Total data allocated
> by obmalloc is 262 MB. At the
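One note on collecting those stats: sys._debugmallocstats() writes to the C-level stderr stream, so contextlib.redirect_stderr won't capture it; fd-level redirection is needed. A sketch (the helper name is mine; the API is CPython-private and its output format is not stable across versions):

```python
import os
import sys
import tempfile

def capture_mallocstats() -> str:
    """Capture the obmalloc statistics dump, which is written with C
    stdio to fd 2 and therefore invisible to redirect_stderr()."""
    with tempfile.TemporaryFile(mode="w+") as tmp:
        saved = os.dup(2)             # save the real stderr fd
        try:
            os.dup2(tmp.fileno(), 2)  # point fd 2 at the temp file
            sys._debugmallocstats()   # C runtime writes land in tmp
        finally:
            os.dup2(saved, 2)         # restore stderr
            os.close(saved)
        tmp.seek(0)
        return tmp.read()
```

Combined with a little parsing, this is enough to script the kind of pool-overhead measurement described above.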
On 2019-06-09, Tim Peters wrote:
> And now there's a PR that removes obmalloc's limit on pool sizes, and,
> for a start, quadruples pool (and arena!) sizes on 64-bit boxes:
Neat.
> As the PR says,
>
> """
> It would be great to get feedback from 64-bit apps that do massive
> amounts of
[Tim]
> For the current obmalloc, I have in mind a different way ...
> Not ideal, but ... captures the important part (more objects
> in a pool -> more times obmalloc can remain in its
> fastest "all within the pool" paths).
And now there's a PR that removes obmalloc's limit on pool sizes, and,
To be clearer, while knowing the size of allocated objects may be of
some use to some other allocators, "not really" for obmalloc. That
one does small objects by itself in a uniform way, and punts
everything else to the system malloc family. The _only_ thing it
wants to know on a free/realloc is
On 2019-06-06, Tim Peters wrote:
> The doubly linked lists in gc primarily support efficient
> _partitioning_ of objects for gc's purposes (a union of disjoint sets,
> with constant-time moving of an object from one set to another, and
> constant-time union of disjoint sets). "All objects" is
On 2019-06-06, Tim Peters wrote:
> Like now: if the size were passed in, obmalloc could test the size
> instead of doing the `address_in_range()` dance(*). But if it's ever
> possible that the size won't be passed in, all the machinery
> supporting `address_in_range()` still needs to be there,
On Thu, 6 Jun 2019 17:26:17 -0500
Tim Peters wrote:
>
> The doubly linked lists in gc primarily support efficient
> _partitioning_ of objects for gc's purposes (a union of disjoint sets,
> with constant-time moving of an object from one set to another, and
> constant-time union of disjoint
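The constant-time operations Tim lists map directly onto circular doubly linked lists with sentinel heads, as in CPython's Modules/gcmodule.c. A Python model (the function names echo the C `gc_list_*` helpers, but the details here are mine):

```python
class GCNode:
    # Intrusive doubly linked node, modeled on PyGC_Head's prev/next
    # pointers.  A "list" is a circular chain through a sentinel node.
    __slots__ = ("prev", "next", "name")
    def __init__(self, name=""):
        self.prev = self.next = self
        self.name = name

def gc_list_append(node, head):
    """O(1): link node at the tail of the list rooted at head."""
    tail = head.prev
    tail.next = node
    node.prev = tail
    node.next = head
    head.prev = node

def gc_list_move(node, head):
    """O(1): move node to another list -- no need to know which list it
    currently belongs to.  This is the 'partitioning' operation."""
    node.prev.next = node.next
    node.next.prev = node.prev
    gc_list_append(node, head)

def gc_list_merge(src, dst):
    """O(1): append all of src onto dst (union of disjoint sets)."""
    if src.next is not src:                  # src not empty
        first, last, tail = src.next, src.prev, dst.prev
        tail.next = first
        first.prev = tail
        last.next = dst
        dst.prev = last
        src.prev = src.next = src            # leave src empty

def names(head):
    out, n = [], head.next
    while n is not head:
        out.append(n.name)
        n = n.next
    return out
```

Iterating "all objects" would require walking every such list; nothing here supports random access, which is the point being made.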
[Tim]
>> But I don't know what you mean by "access memory in random order to
>> iterate over known objects". obmalloc never needs to iterate over
>> known objects - indeed, it contains no code capable of doing that.
>> Our cyclic gc does, but that's independent of obmalloc.
[Antoine]
> It's
On Thu, 6 Jun 2019 16:03:03 -0500
Tim Peters wrote:
> But I don't know what you mean by "access memory in random order to
> iterate over known objects". obmalloc never needs to iterate over
> known objects - indeed, it contains no code capable of doing that.
> Our cyclic gc does, but that's
[Antoine Pitrou ]
> But my response was under the assumption that we would want obmalloc to
> deal with all allocations.
I didn't know that. I personally have no interest in that: if we
want an all-purpose allocator, there are several already to choose
from. There's no reason to imagine we
On Thu, 6 Jun 2019 13:57:37 -0500
Tim Peters wrote:
> [Antoine Pitrou ]
> > The interesting thing here is that in many situations, the size is
> > known up front when deallocating - it is simply not communicated to the
> > deallocator because the traditional free() API takes a sole pointer,
> >
[Antoine Pitrou ]
> The interesting thing here is that in many situations, the size is
> known up front when deallocating - it is simply not communicated to the
> deallocator because the traditional free() API takes a sole pointer,
> not a size. But CPython could communicate that size easily if
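If the size were passed to the deallocator, dispatch could collapse to a single comparison. A sketch in the spirit of obmalloc's size classes (the 512-byte small-request threshold is obmalloc's; the 16-byte alignment and the pool stand-in are illustrative assumptions):

```python
from collections import defaultdict

ALIGNMENT = 16                 # assumed 64-bit alignment
SMALL_REQUEST_THRESHOLD = 512  # obmalloc handles requests up to this size

def size_class(nbytes):
    # Requests round up to a multiple of ALIGNMENT; each multiple is its
    # own size class (index 0 covers 1-16 bytes, 1 covers 17-32, ...).
    return (nbytes - 1) // ALIGNMENT

def sized_free(addr, nbytes, pools, raw_free):
    # With the size passed in, one comparison replaces the
    # address_in_range() dance: small requests must have come from the
    # object allocator's pools; anything bigger was punted to malloc().
    if 0 < nbytes <= SMALL_REQUEST_THRESHOLD:
        pools[size_class(nbytes)].append(addr)   # back on its free list
    else:
        raw_free(addr)

pools = defaultdict(list)
released = []
sized_free(0x1000, 24, pools, released.append)    # small -> pool
sized_free(0x2000, 4096, pools, released.append)  # large -> raw free
```

This mirrors C++'s sized `operator delete`: the caller often does know the size (e.g., from the type), and passing it down lets the allocator skip its ownership check entirely.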