Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
2013/10/29 Victor Stinner victor.stin...@gmail.com:
> 2013/10/29 Kristján Valur Jónsson krist...@ccpgames.com:
>> I was thinking something similar. It would be useful to be able to pause and resume if one is doing any analysis work in the live environment. This would reduce the need to have Filter objects.
>
> Internally, tracemalloc uses a thread-local variable (called the reentrant flag) to temporarily disable tracing of allocations in the current thread. It only disables tracing of new allocations; deallocations are still processed.

If I give access to this flag, it would be possible to temporarily disable tracing in the current thread, but tracing would still be enabled in other threads. Would it fit your requirement?

Example:
---
tracemalloc.enable()
# start your application
...
# spawn many threads
...
# oh no, I don't want to trace this ugly function
tracemalloc.disable_local()
ugly_function()
tracemalloc.enable_local()
...
snapshot = take_snapshot()
---

You can imagine a context manager based on these two functions:
---
with disable_tracing_temporarily_in_current_thread():
    ugly_function()
---

I still don't understand why you would need to stop tracing temporarily. When I use tracemalloc, I never disable it.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
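The context manager imagined in this message could be sketched roughly as follows. This is only an illustration: disable_local() and enable_local() are the functions proposed in the message, not functions of the released tracemalloc module, so the sketch takes the two callables as parameters. The name tracing_paused is made up for the example.

```python
from contextlib import contextmanager


@contextmanager
def tracing_paused(disable_local, enable_local):
    """Call disable_local() on entry and enable_local() on exit.

    Both callables are hypothetical stand-ins for the per-thread
    functions proposed in the thread.
    """
    disable_local()
    try:
        yield
    finally:
        # Re-enable tracing even if the body raises an exception.
        enable_local()


# Usage with recording stand-ins instead of real tracing functions.
calls = []
with tracing_paused(lambda: calls.append("disable"),
                    lambda: calls.append("enable")):
    calls.append("ugly_function")
```

The try/finally is the main design point: without it, an exception in the body would leave tracing disabled in that thread.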
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
2013/10/31 Victor Stinner victor.stin...@gmail.com:
> If I give access to this flag, it would be possible to disable temporarily tracing in the current thread, but tracing would still be enabled in other threads. Would it fit your requirement?

It's probably not what you are looking for :-)

As I wrote in the PEP, the API of tracemalloc was inspired by the faulthandler module. enable() / disable() makes sense in faulthandler because faulthandler is passive: it only does something on a trigger (synchronous signals like SIGFPE or SIGSEGV). I realized that tracemalloc is different: as written in the documentation, enable() *starts* tracing. After enable() has been called, tracemalloc becomes active. So tracemalloc should use the names start() / stop() rather than enable() / disable().

I did another experiment. I replaced enable/disable/is_enabled with start/stop/is_tracing, and added enable/disable/is_enabled functions to temporarily disable tracing.

API:

- clear_traces(): clear traces
- start(): start tracing (the old enable)
- stop(): stop tracing and clear traces (the old disable)
- disable(): temporarily disable tracing
- enable(): reenable tracing
- is_tracing(): True if tracemalloc is tracing, False otherwise (the old is_enabled)
- is_enabled(): True if tracemalloc is enabled, False otherwise

All these functions are process-wide (affect all threads). tracemalloc only traces new allocations if is_tracing() and is_enabled() are both True.

If is_tracing() is True and is_enabled() is False, deallocations still remove traces (otherwise, the internal dictionary of traces would become inconsistent).

Example:
---
tracemalloc.start()
# start your application
...
useful = UsefulObject()
huge = HugeObject()
...
snapshot1 = take_snapshot()
...
# oh no, I don't want to trace this ugly object, but please don't trash old traces
tracemalloc.disable()
ugly = ugly_object()
...
# release memory of the huge object
huge = None
...
# restart tracing (ugly is still alive)
tracemalloc.enable()
...
snapshot2 = take_snapshot()
tracemalloc.stop()
---

snapshot1 contains traces of objects:

- useful
- huge

snapshot2 contains traces of objects:

- useful

huge is missing from snapshot2 even though it was released while the module was disabled. ugly is missing from snapshot2 because tracing was disabled when it was allocated.

Does it look better?

I don't see the use case for disable() / enable() yet, but it's cheap (it just adds a flag).

Victor
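The interaction of the two flags described in this message can be modelled with a small toy class. This is a sketch of the proposed semantics only, not the real tracemalloc implementation: ToyTracer, on_malloc() and on_free() are hypothetical names, and the addresses are invented.

```python
class ToyTracer:
    """Toy model of the proposed start/stop and enable/disable flags."""

    def __init__(self):
        self.tracing = False   # toggled by start() / stop()
        self.enabled = True    # toggled by enable() / disable()
        self.traces = {}       # {address: (size, label)}

    def start(self):
        self.tracing = True

    def stop(self):
        # stop() both stops tracing and clears the traces.
        self.tracing = False
        self.traces.clear()

    def disable(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    def on_malloc(self, address, size, label):
        # New allocations are recorded only if both flags are set.
        if self.tracing and self.enabled:
            self.traces[address] = (size, label)

    def on_free(self, address):
        # Deallocations remove traces even while disabled.
        if self.tracing:
            self.traces.pop(address, None)

    def take_snapshot(self):
        return dict(self.traces)


def replay_example():
    """Replay the useful/huge/ugly scenario from the message."""
    tracer = ToyTracer()
    tracer.start()
    tracer.on_malloc(0x1000, 64, "useful")
    tracer.on_malloc(0x2000, 4096, "huge")
    snapshot1 = tracer.take_snapshot()
    tracer.disable()
    tracer.on_malloc(0x3000, 128, "ugly")   # not recorded
    tracer.on_free(0x2000)                  # huge released while disabled
    tracer.enable()
    snapshot2 = tracer.take_snapshot()
    tracer.stop()
    return snapshot1, snapshot2
```

In this model snapshot1 holds useful and huge, while snapshot2 holds only useful, which matches the outcome listed in the message.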
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
On 10/31/2013 05:20 AM, Victor Stinner wrote:
> I did another experiment. I replaced enable/disable/is_enabled with start/stop/is_tracing, and added enable/disable/is_enabled functions to disable temporarily tracing.
>
> API:
>
> - clear_traces(): clear traces
> - start(): start tracing (the old enable)
> - stop(): stop tracing and clear traces (the old disable)
> - disable(): disable temporarily tracing
> - enable(): reenable tracing
> - is_tracing(): True if tracemalloc is tracing, False otherwise (the old is_enabled)
> - is_enabled(): True if tracemalloc is enabled, False otherwise

These names make more sense. However, `stop` is still misleading as it both stops and destroys data. An easy fix for that is for stop to save the data somewhere so get_traces (or whatever) can still retrieve it.

If `stop` really must destroy the data, perhaps it should be called `close` instead; StringIO has a similar close method that destroys any stored data when called, and getvalue must be called first if that data is wanted.

--
~Ethan~
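The StringIO behaviour used as the analogy here can be checked with a few lines. The variable names are illustrative only.

```python
import io

buffer = io.StringIO()
buffer.write("trace data")

# The contents must be read before close(), which discards the buffer.
saved = buffer.getvalue()
buffer.close()

try:
    buffer.getvalue()
except ValueError as exc:
    # Reading from a closed StringIO raises ValueError.
    error = exc
else:
    error = None
```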
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
Hi,

2013/10/30 Jim J. Jewett jimjjew...@gmail.com:
> Well, unless I missed it... I don't see how to get anything beyond the return value of get_traces, which is a (time-ordered?) list of allocation size with then-current call stack. It doesn't mention any attribute for indicating that some entries are de-allocations, let alone the actual address of each allocation.

get_traces() does return the traces of the currently allocated memory blocks. It's not a log of alloc/dealloc calls. The list is not sorted. If you want a sorted list, use take_snapshot().statistics('lineno') for example.

> In that case, I would expect disabling (and filtering) to stop capturing new allocation events for me, but I would still expect tracemalloc to do proper internal maintenance.

tracemalloc has a significant overhead in terms of performance and memory. The purpose of disable() is to... disable the module, to completely remove the overhead.

In practice, enable() installs hooks on memory allocators, disable() uninstalls these hooks.

I don't understand why you are so concerned by disable(). Why would you like to keep traces and disable the module? I never called disable() in my own tests, the module is automatically disabled at exit.

Victor
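A sorted view of the traces, as suggested in this message, might look like the sketch below. It uses the tracemalloc API as later released in the standard library (start(), take_snapshot(), stop()), which differs from the enable()/disable() draft discussed in this thread. The function name top_allocations and the allocation sizes are made up for the example.

```python
import tracemalloc


def top_allocations(limit=3):
    """Return the biggest allocation sites recorded during a small workload."""
    tracemalloc.start()
    try:
        # Keep the blocks alive so that they appear in the snapshot.
        blocks = [bytes(10_000) for _ in range(100)]
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # statistics() groups traces by source line, biggest size first.
    return snapshot.statistics("lineno")[:limit]


for stat in top_allocations():
    print(stat)
```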
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
2013/10/30 Stephen J. Turnbull step...@xemacs.org:
> Just "reset" implies to me that you're ready to start over. Not just traced memory blocks but accumulated statistics and any configuration (such as Filters) would also be reset. Also tracing would be disabled until started explicitly.

If the name is really the problem, I propose to restore the previous name: clear_traces(). It's symmetric with get_traces(), like add_filter()/get_filters()/clear_filters().

> Shouldn't disable() do this automatically, perhaps with an optional discard_traces flag (which would be False by default)?

The pattern is something like that:
---
enable()
snapshot1 = take_snapshot()
...
snapshot2 = take_snapshot()
disable()
---

I don't see why disable() would return data.

> But I definitely agree with Jim: You *must* provide an example here showing how to save the traces (even though it's trivial to do so), because that will make clear that disable() is a destructive operation. (It is not destructive in any other debugging tool that I've used.) Even with documentation, be prepared for user complaints.

I added "Call get_traces() or take_snapshot() function to get traces before clearing them." to the doc:
http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html#tracemalloc.disable

Victor
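The "save the traces first" pattern requested in this exchange can be sketched as follows. It is written against the tracemalloc API as later released (start()/stop() rather than the enable()/disable() draft names used in this thread), and snapshot_then_stop is a name invented for the example.

```python
import tracemalloc


def snapshot_then_stop():
    """Take a snapshot before stop(), because stop() clears the traces."""
    tracemalloc.start()
    blocks = [bytes(1_000) for _ in range(100)]  # something to trace
    snapshot = tracemalloc.take_snapshot()       # save the traces first
    tracemalloc.stop()                           # destructive: traces are cleared
    return snapshot, len(blocks)


snapshot, block_count = snapshot_then_stop()

# The saved snapshot stays usable after tracing has stopped.
print(len(snapshot.traces), "traces saved for", block_count, "blocks")
```

Once tracing has stopped, a new snapshot can no longer be taken, so the order of the two calls matters.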
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
On Wed, Oct 30, 2013 at 6:02 AM, Victor Stinner victor.stin...@gmail.com wrote:
> 2013/10/30 Jim J. Jewett jimjjew...@gmail.com:
>> Well, unless I missed it... I don't see how to get anything beyond the return value of get_traces, which is a (time-ordered?) list of allocation size with then-current call stack. It doesn't mention any attribute for indicating that some entries are de-allocations, let alone the actual address of each allocation.
>
> get_traces() does return the traces of the currently allocated memory blocks. It's not a log of alloc/dealloc calls. The list is not sorted. If you want a sorted list, use take_snapshot().statistics('lineno') for example.

Any list is sorted somehow; I had assumed that it was defaulting to order-of-creation, though if you use a dict internally, that might not be the case. If you return it as a list instead of a dict, but that list is NOT in time-order, that is worth documenting.

Also, am I misreading the documentation of get_traces() function?

    Get traces of memory blocks allocated by Python. Return a list of (size: int, traceback: tuple) tuples. traceback is a tuple of (filename: str, lineno: int) tuples.

So it now sounds like you don't bother to emit de-allocation events because you just remove the allocation from your internal data structure.

In other words, you provide a snapshot, but not a history -- except that the snapshot isn't complete either, because it only shows things that appeared after a certain event (the most recent enablement).

I still don't see anything here(*) that requires even saving the address, let alone preventing re-use.

(*) get_object_traceback(obj) might require a stored address for efficiency, but the base functionality of getting traces doesn't. I still wouldn't worry about address re-use though, because the address should not be re-used until the object has been deleted -- and is no longer available to be passed to get_object_traceback. So the worst that can happen is that an object which was not traced might return a bogus answer instead of failing.

>> In that case, I would expect disabling (and filtering) to stop capturing new allocation events for me, but I would still expect tracemalloc to do proper internal maintenance.
>
> tracemalloc has an important overhead in term of performances and memory. The purpose of disable() is to... disable the module, to remove completely the overhead.
> ...
> Why would you like to keep traces and disable the module?

Because of that very overhead. I think my typical use case would be similar to Kristján Valur's, but I'll try to spell it out in more detail here.

(1) Whoa -- memory hog! How can I fix this?

(2) I know -- track all allocations, with a traceback showing why they were made. (At a minimum, I would like to be able to subclass your tool to do this -- preferably without also keeping the full history in memory.)

(3) Oh, maybe I should skip the ones that really are temporary and get cleaned up. (You make this easy by handling the de-allocs, though I'm not sure those events get exposed to anyone working at the python level, as opposed to modifying and re-compiling.)

(4) hmm... still too big ... I should use filters. (But will changing those filters while tracing is enabled mess up your current implementation?)

(5) Argh. What I really want is to know what gets allocated at times like XXX. I can do that if times-like-XXX only ever occur once per process. I *might* be able to do it with filters. But I would rather do it by saying "trace on" and "trace off". Maybe even with a context manager around the suspicious places.

(6) Then, at the end of the run, I would say "give me the info about how much was allocated when tracing was on." Some of that might be going away again when tracing is off, but at least I know what is making the allocations in the first place. And I know that they're sticking around "long enough".

Under your current proposal, step (5) turns into

    set filters
    trace on
    ...
    get_traces
    serialize to some other storage
    trace off

and step (6) turns into

    read in from that other storage I just made up on the fly, and do my own summarizing, because my format is almost by definition non-standard.

This complication isn't intolerable, but neither is it what I expect from python. And it certainly isn't what I expect from a binary toggle like enable/disable. (So yes, changing the name to clear_traces would help, because I would still be disappointed, but at least I wouldn't be surprised.)

Also, if you do stick with the current limitations, then why even have get_traces, as opposed to just take_snapshot? Is there some difference between them, except that a snapshot has some convenience methods and some simple metadata?

Later, he wrote:
> I don't see why disable() would return data.

disable is indeed a bad name for something that returns data. The only reason to return data from
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
On 30 Oct 2013 at 20:58, Jim Jewett jimjjew...@gmail.com wrote:
> though if you use a dict internally, that might not be the case.

Tracemalloc uses a {address: trace} dict internally.

> If you return it as a list instead of a dict, but that list is NOT in time-order, that is worth documenting

OK, I will document it.

> Also, am I misreading the documentation of get_traces() function?
>
>     Get traces of memory blocks allocated by Python. Return a list of (size: int, traceback: tuple) tuples. traceback is a tuple of (filename: str, lineno: int) tuples.
>
> So it now sounds like you don't bother to emit de-allocation events because you just remove the allocation from your internal data structure.

I don't understand your question. Tracemalloc does not store events but traces. When a memory block is deallocated, it is removed from the internal dict (and so from the get_traces() list).

> I still don't see anything here(*) that requires even saving the address, let alone preventing re-use.

The address must be stored internally to maintain the internal dict. See the C code.

> (1) Whoa -- memory hog! How can I fix this?
>
> (2) I know -- track all allocations, with a traceback showing why they were made. (At a minimum, I would like to be able to subclass your tool to do this -- preferably without also keeping the full history in memory.)

What do you mean by "full history" and "subclass your tool"?

> (3) Oh, maybe I should skip the ones that really are temporary and get cleaned up. (You make this easy by handling the de-allocs, though I'm not sure those events get exposed to anyone working at the python level, as opposed to modifying and re-compiling.)

If your temporary objects are destroyed before you call get_traces(), you will not see them in get_traces(). I don't understand.

> (4) hmm... still too big ... I should use filters. (But will changing those filters while tracing is enabled mess up your current implementation?)

If you call add_filter(), new traces will be filtered. Not the old ones, as explained in the doc. What do you mean by "mess up"?

> (5) Argh. What I really want is to know what gets allocated at times like XXX. I can do that if times-like-XXX only ever occur once per process. I *might* be able to do it with filters. But I would rather do it by saying trace on and trace off. Maybe even with a context manager around the suspicious places.

I don't understand "times like XXX", what is it?

To see what happened between two lines of code, you can compare two snapshots. No need to disable tracing.

> (6) Then, at the end of the run, I would say give me the info about how much was allocated when tracing was on. Some of that might be going away again when tracing is off, but at least I know what is making the allocations in the first place. And I know that they're sticking around long enough.

I think you misunderstood how tracemalloc works. You should compile it and play with it. In my opinion, you already have everything in tracemalloc for your scenario.

> Under your current proposal, step (5) turns into
>
>     set filters
>     trace on
>     ...
>     get_traces
>     serialize to some other storage
>     trace off

---
s1=take_snapshot()
...
s2=take_snapshot()
...
diff=s2.statistics(lines, compare_to=s1)
---

> why even have get_traces, as opposed to just take_snapshot? Is there some difference between them, except that a snapshot has some convenience methods and some simple metadata?

See the doc: Snapshot.traces is the result of get_traces().

get_traces() is here if you want to write your own tool without Snapshot.

Victor
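Comparing two snapshots around a suspicious region, as suggested in this reply, could be sketched like this. The code uses the tracemalloc API as later released, where the comparison is spelled Snapshot.compare_to() rather than the statistics(..., compare_to=...) draft shown in the message. The names allocations_between and suspicious are hypothetical.

```python
import tracemalloc


def allocations_between(work):
    """Run work() and report what it left allocated, grouped by source line."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = work()            # keep the result alive until 'after' is taken
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Entries are StatisticDiff objects, biggest absolute change first.
    return result, after.compare_to(before, "lineno")


def suspicious():
    # Stand-in for the region of interest.
    return [bytes(10_000) for _ in range(50)]


result, diff = allocations_between(suspicious)
for entry in diff[:3]:
    print(entry)
```

This only reports blocks that are still alive at the second snapshot, which is the "snapshot, not history" behaviour discussed earlier in the thread.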
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
Jim Jewett writes:
> Later, he wrote:
>> I don't see why disable() would return data.
>
> disable is indeed a bad name for something that returns data.

Note that I never proposed that disable() *return* anything, only that it *get* the trace. It could store it in some specified object, or a file, rather than return it, for example. I deliberately left what it does with the retrieved data unspecified. The important thing to me is that it not be dropped on the floor by something named "disable".
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
>> disable() function:
>>
>> Stop tracing Python memory allocations and clear traces of memory blocks allocated by Python.
>
> I would expect disable to stop tracing, but I would not expect it to clear out the traces it had already captured. If it has to do that, please put in some sample code showing how to save the current traces before disabling.

I was thinking something similar. It would be useful to be able to pause and resume if one is doing any analysis work in the live environment. This would reduce the need to have Filter objects.

K
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
2013/10/29 Jim Jewett jimjjew...@gmail.com:
>> reset() function:
>>
>> Clear traces of memory blocks allocated by Python.
>
> Does this do anything besides clear? If not, why not just re-use the 'clear' name from dicts?

(I like the reset() name. Charles-François suggested this name inspired by OProfile API.)

>> disable() function:
>>
>> Stop tracing Python memory allocations and clear traces of memory blocks allocated by Python.
>
> I would expect disable to stop tracing, but I would not expect it to clear out the traces it had already captured. If it has to do that, please put in some sample code showing how to save the current traces before disabling.

For consistency, you cannot keep traces when tracing is disabled. The free() hook must stay enabled to remove freed memory blocks, or the next malloc() may get the same address, which would raise an assertion error (you cannot have two memory blocks at the same address).

Just call get_traces() to get traces before clearing them. I can explain it in the doc.

2013/10/29 Kristján Valur Jónsson krist...@ccpgames.com:
> I was thinking something similar. It would be useful to be able to pause and resume if one is doing any analysis work in the live environment. This would reduce the need to have Filter objects.

For the reason explained above, it's not possible to disable the whole module temporarily.

Internally, tracemalloc uses a thread-local variable (called the reentrant flag) to temporarily disable tracing of allocations in the current thread. It only disables tracing of new allocations; deallocations are still processed.

Victor
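The address re-use problem described in this message can be illustrated with a toy replay of allocator events. This is a simplified model under stated assumptions, not the C implementation: the event tuples, the replay() function and the addresses are all invented, and a recorded collision stands in for the assertion error mentioned above.

```python
def replay(events, track_frees_while_paused):
    """Replay allocator events against a toy {address: label} trace dict."""
    traces = {}
    collisions = []
    paused = False
    for event in events:
        kind = event[0]
        if kind == "pause":
            paused = True
        elif kind == "resume":
            paused = False
        elif kind == "malloc":
            _, address, label = event
            if paused:
                continue  # new allocations are not traced while paused
            if address in traces:
                # Two live traces at one address: the inconsistent state.
                collisions.append((address, traces[address], label))
            traces[address] = label
        elif kind == "free":
            _, address = event
            if paused and not track_frees_while_paused:
                continue  # the stale trace stays in the dict
            traces.pop(address, None)
    return traces, collisions


EVENTS = [
    ("malloc", 0x1000, "first block"),
    ("pause",),
    ("free", 0x1000),                    # freed while tracing is paused
    ("resume",),
    ("malloc", 0x1000, "second block"),  # allocator re-uses the address
]

consistent = replay(EVENTS, track_frees_while_paused=True)
inconsistent = replay(EVENTS, track_frees_while_paused=False)
```

With frees still processed while paused, the dict ends with a single entry and no collision; with frees ignored, the stale "first block" entry collides with the new allocation.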
[Python-Dev] PEP 454 (tracemalloc) disable == clear?
(Tue Oct 29 12:37:52 CET 2013) Victor Stinner wrote:
> For consistency, you cannot keep traces when tracing is disabled. The free() must be enabled to remove allocated memory blocks, or next malloc() may get the same address which would raise an assertion error (you cannot have two memory blocks at the same address).

That seems like a quirk of the implementation, particularly since the actual address is not returned to the user. Nor do I see any way of knowing when that allocation is freed.

Well, unless I missed it... I don't see how to get anything beyond the return value of get_traces, which is a (time-ordered?) list of allocation size with then-current call stack. It doesn't mention any attribute for indicating that some entries are de-allocations, let alone the actual address of each allocation.

> For the reason explained above, it's not possible to disable the whole module temporarily.
>
> Internally, tracemalloc uses a thread-local variable (called the reentrant flag) to temporarily disable tracing of allocations in the current thread. It only disables tracing of new allocations; deallocations are still processed.

Even assuming the restriction is needed, this just seems to mean that disabling (or filtering) should not affect de-allocation events, for fear of corrupting tracemalloc's internal structures.

In that case, I would expect disabling (and filtering) to stop capturing new allocation events for me, but I would still expect tracemalloc to do proper internal maintenance.

It would at least explain why you need both disable *and* reset; reset would empty those internal structures, so that tracemalloc could shortcut that maintenance. I would NOT assume that I needed to call reset when changing the filters, nor would I assume that changing them threw out existing traces.

-jJ

--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ
Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?
Victor Stinner writes:
> 2013/10/29 Jim Jewett jimjjew...@gmail.com:
>>> reset() function:
>>>
>>> Clear traces of memory blocks allocated by Python.
>>
>> Does this do anything besides clear? If not, why not just re-use the 'clear' name from dicts?
>
> (I like the reset() name. Charles-François suggested this name inspired by OProfile API.)

Just "reset" implies to me that you're ready to start over. Not just traced memory blocks but accumulated statistics and any configuration (such as Filters) would also be reset. Also tracing would be disabled until started explicitly.

If you want it to apply just to the traces, reset_traces() would be more appropriate.

>>> disable() function:
>>>
>>> Stop tracing Python memory allocations and clear traces of memory blocks allocated by Python.
>>
>> I would expect disable to stop tracing, but I would not expect it to clear out the traces it had already captured. If it has to do that, please put in some sample code showing how to save the current traces before disabling.
>
> For consistency, you cannot keep traces when tracing is disabled. The free() must be enabled to remove allocated memory blocks, or next malloc() may get the same address which would raise an assertion error (you cannot have two memory blocks at the same address).

Then I would not call this "disable". disable() should not destroy data.

> Just call get_traces() to get traces before clearing them. I can explain it in the doc.

Shouldn't disable() do this automatically, perhaps with an optional discard_traces flag (which would be False by default)?

But I definitely agree with Jim: You *must* provide an example here showing how to save the traces (even though it's trivial to do so), because that will make clear that disable() is a destructive operation. (It is not destructive in any other debugging tool that I've used.) Even with documentation, be prepared for user complaints.
[Python-Dev] PEP 454 (tracemalloc) disable == clear?
> reset() function:
>
> Clear traces of memory blocks allocated by Python.

Does this do anything besides clear? If not, why not just re-use the 'clear' name from dicts?

> disable() function:
>
> Stop tracing Python memory allocations and clear traces of memory blocks allocated by Python.

I would expect disable to stop tracing, but I would not expect it to clear out the traces it had already captured. If it has to do that, please put in some sample code showing how to save the current traces before disabling.

-jJ