Hi,

The following function is completely reasonable. It shouldn't be hard to
implement (a few lines of C code).

    def reset_peak_memory():
        # in _tracemalloc.c
        tracemalloc_peak_trace_memory = tracemalloc_traced_memory;

Resetting the peak to tracemalloc_traced_memory is correct :-)

Can you please open an issue at https://bugs.python.org/ to request the
feature? Do you want to implement it? Put me (vstinner) in the nosy list
of the issue. I wrote tracemalloc, so I can help you implement the
feature ;-)

Victor

On Thu, 14 May 2020 at 15:06, <wilson.h...@gmail.com> wrote:
>
> Hi,
>
> It would be helpful for us if tracemalloc had a function that reset the peak
> memory usage counter, without clearing the current traces. At the moment, I
> don't think there's a way to find the peak memory of a subset of the code
> since the initial tracemalloc.start() call, without calling
> tracemalloc.clear_traces(). The latter disturbs other parts of the tracing.
>
> Specifically, it might be a function like (pseudo-implementation):
>
>     def reset_peak_memory():
>         # in _tracemalloc.c
>         tracemalloc_peak_trace_memory = tracemalloc_traced_memory;
>
> This would allow easily determining the peak memory usage of a specific piece
> of code, without disturbing all of the traces. For example, the following
> would set specific_peak to the highest size of traced memory of just line X:
>
>     tracemalloc.start()
>     # ... code where allocations matter, but the peak does not ...
>     peak_memory_doesnt_matter()
>
>     tracemalloc.reset_peak_memory()
>     peak_memory_is_important() # X
>     _, specific_peak = tracemalloc.get_traced_memory()
>
>     # ... more code with relevant allocations ...
>     peak_memory_doesnt_matter()
>
>     tracemalloc.stop()
>
> As sketched above, the implementation of this should be quite small, with the
> core being the line mentioned above, plus all the required extras (locking,
> wrapping, documentation, tests, ...). Thoughts?
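For reference, the usage pattern quoted above can be tried on Python 3.9 or
later, where tracemalloc.reset_peak() provides this behaviour
(reset_peak_memory() is only the name suggested in the proposal). What follows
is a minimal sketch, not the proposed implementation: peak_of(),
big_temporary() and small_temporary() are hypothetical helpers standing in for
the real workloads.

```python
import tracemalloc

MB = 1024 * 1024


def peak_of(func):
    """Return the peak traced memory (bytes) reached while func() runs.

    Assumes tracemalloc is already tracing and that tracemalloc.reset_peak()
    is available (Python 3.9+). Existing traces are left untouched.
    """
    tracemalloc.reset_peak()  # peak := currently traced size
    func()
    _, peak = tracemalloc.get_traced_memory()
    return peak


def big_temporary():
    # Hypothetical workload: an ~8 MB temporary that is freed on return.
    data = bytearray(8 * MB)
    return len(data)


def small_temporary():
    # Hypothetical workload: a ~1 MB temporary that is freed on return.
    data = bytearray(1 * MB)
    return len(data)


tracemalloc.start()

big_temporary()  # raises the overall peak to roughly 8 MB
_, overall_peak = tracemalloc.get_traced_memory()

# Only the peak reached inside small_temporary() is reported here.
specific_peak = peak_of(small_temporary)

tracemalloc.stop()

print(f"overall peak:  {overall_peak / MB:.1f} MB")
print(f"specific peak: {specific_peak / MB:.1f} MB")
```

Because only the peak counter is reset, snapshots taken before the call stay
comparable with snapshots taken after it.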
>
> Full motivation for why we want to do this:
>
> In <https://github.com/stellargraph/stellargraph>, we're using the
> tracemalloc module to understand the memory usage of our core StellarGraph
> graph class (a nodes-and-edges graph, not a plot, to be clear). It stores
> some NumPy arrays of feature vectors associated with each node in the graph,
> along with all of the edge information. Any of these pieces can be large, and
> we want to keep the resource usage as small as possible. We're monitoring
> this by instrumenting the construction: start from a raw set of nodes
> (including potentially large amounts of features) and edges, and build a
> StellarGraph object, recording some metrics:
>
> 1. the time
> 2. the total memory usage of the graph instance
> 3. the additional memory usage, that's not shared with the raw data (in
>    particular, if the raw data is 1GB, it's useful to know whether a 1.5GB graph
>    instance consists of 0.5GB of new memory, or 1.5GB of new memory)
> 4. the peak memory usage during construction
>
> 2, 3 and 4 we record using a combination of tracemalloc.take_snapshot() and
> tracemalloc.get_traced_memory(), something like:
>
>     from tracemalloc import get_traced_memory, take_snapshot
>
>     def diff(after, before):
>         # net change in traced bytes between two snapshots
>         return sum(elem.size_diff for elem in after.compare_to(before, "lineno"))
>
>     snap_start = take_snapshot()
>
>     raw = load_data_from_disk()
>     snap_raw = take_snapshot()
>
>     # X
>
>     graph = create_graph(raw)
>     snap_raw_graph = take_snapshot()
>     _, mem_peak = get_traced_memory() # 4
>
>     del raw
>     snap_graph = take_snapshot()
>
>     mem_raw = diff(snap_raw, snap_start) # baseline
>     mem_graph = diff(snap_graph, snap_start) # 2
>     mem_graph_not_shared = diff(snap_raw_graph, snap_raw) # 3
>
> ('measure_memory' in
> <https://nbviewer.jupyter.org/github/stellargraph/stellargraph/blob/93fce46166645dd0d1ca2ea2862b68355826e3fc/demos/zzz-internal-developers/graph-resource-usage.ipynb#Measurement>
> has all the gory details.)
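The measurement scheme quoted above can be sketched end to end as follows.
This is only an illustration under stated assumptions: load_data_from_disk()
and create_graph() here are hypothetical stand-ins that allocate fixed-size
buffers, not StellarGraph code, and the reported sizes may include a small
overhead from the snapshot objects themselves.

```python
import tracemalloc

MB = 1024 * 1024


def diff(after, before):
    # Net change in traced bytes between two snapshots.
    return sum(stat.size_diff for stat in after.compare_to(before, "lineno"))


def load_data_from_disk():
    # Hypothetical stand-in for the raw data: about 4 MB.
    return bytearray(4 * MB)


def create_graph(raw):
    # Hypothetical stand-in for graph construction: about 2 MB of new
    # memory, plus a reference to the raw buffer (shared, not copied).
    return {"shared": raw, "own": bytearray(2 * MB)}


tracemalloc.start()
snap_start = tracemalloc.take_snapshot()

raw = load_data_from_disk()
snap_raw = tracemalloc.take_snapshot()

graph = create_graph(raw)
snap_raw_graph = tracemalloc.take_snapshot()

del raw  # the graph still references the buffer, so it stays allocated
snap_graph = tracemalloc.take_snapshot()
tracemalloc.stop()

mem_raw = diff(snap_raw, snap_start)                   # baseline
mem_graph = diff(snap_graph, snap_start)               # metric 2
mem_graph_not_shared = diff(snap_raw_graph, snap_raw)  # metric 3

print(f"raw data:         {mem_raw / MB:.1f} MB")
print(f"graph total:      {mem_graph / MB:.1f} MB")
print(f"graph not shared: {mem_graph_not_shared / MB:.1f} MB")
```

All three comparisons rely on the traces recorded since tracemalloc.start(),
which is why clearing the traces partway through would make them meaningless.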
>
> Unfortunately, we want to ignore any peak during data loading: the peak
> during create_graph is all we care about, even if the overall peak (in data
> loading) is higher. That is, we want to only consider the peak memory usage
> after line X. One way to do this would be to call clear_traces() at X, but
> this invalidates the traces used for the 2 and 3 comparisons. I believe
> tracemalloc.reset_peak_memory() is the necessary function to call at X. (Why
> do we want to ignore the peak during data loading? The loading is under the
> control of a user (of stellargraph) as it's typically done via Pandas or
> NumPy and those libraries are out of our control and offer a variety of
> options for tweaking data-loading behavior, whereas the internals of the
> `StellarGraph` instance are in our control and not as configurable by users.)
>
> Thanks,
> Huon Wilson
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/QDWI37A4TJXOYUKULGPY2GKD7IG2JNDC/
> Code of Conduct: http://python.org/psf/codeofconduct/

-- 
Night gathers, and now my watch begins. It shall not end until my death.
_______________________________________________
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/T5XIRL4HTW57KM4RWHR67KJTHYF76U2D/