Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-31 Thread Victor Stinner
2013/10/29 Victor Stinner victor.stin...@gmail.com:
 2013/10/29 Kristján Valur Jónsson krist...@ccpgames.com:
 I was thinking something similar.  It would be useful to be able to pause
 and resume if one is doing any analysis work in the live environment.
 This would reduce the need to have Filter objects.

 Internally, tracemalloc uses a thread-local variable (called the
 "reentrant" flag) to temporarily disable tracing of allocations in the
 current thread. It only disables tracing of new allocations;
 deallocations are still processed.

If I give access to this flag, it would be possible to temporarily
disable tracing in the current thread, while tracing would stay
enabled in other threads. Would that fit your requirement?

Example:
---
tracemalloc.enable()
# start your application
...
# spawn many threads
...
# oh no, I don't want to trace this ugly function
tracemalloc.disable_local()
ugly_function()
tracemalloc.enable_local()
...
snapshot = tracemalloc.take_snapshot()
---

You can imagine a context manager based on these two functions:
---
with disable_tracing_temporarily_in_current_thread():
  ugly_function()
---

I still don't understand why you would need to stop tracing
temporarily. When I use tracemalloc, I never disable it.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-31 Thread Victor Stinner
2013/10/31 Victor Stinner victor.stin...@gmail.com:
 If I give access to this flag, it would be possible to temporarily
 disable tracing in the current thread, while tracing would stay
 enabled in other threads. Would that fit your requirement?

It's probably not what you are looking for :-)

As I wrote in the PEP, the API of tracemalloc was inspired by the
faulthandler module. enable() / disable() makes sense in faulthandler
because faulthandler is passive: it only does something on a trigger
(synchronous signals like SIGFPE or SIGSEGV). I realized that
tracemalloc is different: as written in the documentation, enable()
*starts* tracing. After enable() has been called, tracemalloc becomes
active. So tracemalloc should use names start() / stop() rather than
enable() / disable().

I did another experiment. I replaced enable/disable/is_enabled with
start/stop/is_tracing, and added enable/disable/is_enabled functions
to temporarily disable tracing.

API:

- clear_traces(): clear traces
- start(): start tracing (the old enable)
- stop(): stop tracing and clear traces (the old disable)
- disable(): temporarily disable tracing
- enable(): re-enable tracing
- is_tracing(): True if tracemalloc is tracing, False otherwise (the
old is_enabled)
- is_enabled(): True if tracemalloc is enabled, False otherwise

All these functions are process-wide (affect all threads).

tracemalloc only traces new allocations if both is_tracing() and
is_enabled() are True.

If is_tracing() is True and is_enabled() is False, deallocations still
remove traces (otherwise, the internal dictionary of traces would
become inconsistent).

Example:
---
tracemalloc.start()
# start your application
...
useful = UsefulObject()
huge = HugeObject()
...
snapshot1 = tracemalloc.take_snapshot()
...
# oh no, I don't want to trace this ugly object, but please don't
# trash old traces
tracemalloc.disable()
ugly = ugly_object()
...
# release memory of the huge object
huge = None
...
# restart tracing (ugly is still alive)
tracemalloc.enable()
...
snapshot2 = tracemalloc.take_snapshot()
tracemalloc.stop()
---

snapshot1 contains traces of objects:
- useful
- huge

snapshot2 contains traces of objects:
- useful

huge is missing from snapshot2 even though the module was disabled when
it was freed, because deallocations still remove traces. ugly is missing
from snapshot2 because tracing was disabled when it was allocated.
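
The behaviour described above can be sketched with a toy model. This is
a pure-Python illustration of the two proposed flags, with made-up
addresses and labels; `ToyTracer` is hypothetical and is not how the C
module is implemented.

```python
class ToyTracer:
    """Toy model of the proposed start/stop and enable/disable flags."""

    def __init__(self):
        self.tracing = False  # models is_tracing()
        self.enabled = True   # models is_enabled()
        self.traces = {}      # {address: label}

    def start(self):
        self.tracing = True

    def stop(self):
        self.tracing = False
        self.traces.clear()   # stop() also clears traces

    def malloc(self, address, label):
        # New allocations are traced only if both flags are True.
        if self.tracing and self.enabled:
            self.traces[address] = label

    def free(self, address):
        # Deallocations are processed even while disabled.
        if self.tracing:
            self.traces.pop(address, None)

    def take_snapshot(self):
        return sorted(self.traces.values())


tracer = ToyTracer()
tracer.start()
tracer.malloc(0x10, "useful")
tracer.malloc(0x20, "huge")
snapshot1 = tracer.take_snapshot()
tracer.enabled = False       # disable()
tracer.malloc(0x30, "ugly")  # not traced
tracer.free(0x20)            # huge = None: its trace is still removed
tracer.enabled = True        # enable()
snapshot2 = tracer.take_snapshot()
tracer.stop()
print(snapshot1)  # ['huge', 'useful']
print(snapshot2)  # ['useful']
```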

Does it look better? I don't see the use case of disable() / enable()
yet, but it's cheap (it just adds a flag).

Victor


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-31 Thread Ethan Furman

On 10/31/2013 05:20 AM, Victor Stinner wrote:

I did another experiment. I replaced enable/disable/is_enabled with
start/stop/is_tracing, and added enable/disable/is_enabled functions
to temporarily disable tracing.

API:

- clear_traces(): clear traces
- start(): start tracing (the old enable)
- stop(): stop tracing and clear traces (the old disable)
- disable(): temporarily disable tracing
- enable(): re-enable tracing
- is_tracing(): True if tracemalloc is tracing, False otherwise (the
old is_enabled)
- is_enabled(): True if tracemalloc is enabled, False otherwise


These names make more sense.  However, `stop` is still misleading as it
both stops and destroys data.  An easy fix for that is for stop to save
the data somewhere so get_traces (or whatever) can still retrieve it.


If `stop` really must destroy the data, perhaps it should be called
`close` instead; StringIO has a similar close method that when called
destroys any stored data, and getvalue must be called first if that data
is wanted.


--
~Ethan~


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-30 Thread Victor Stinner
Hi,

2013/10/30 Jim J. Jewett jimjjew...@gmail.com:
 Well, unless I missed it... I don't see how to get anything beyond
 the return value of get_traces, which is a (time-ordered?) list
 of allocation size with then-current call stack.  It doesn't mention
 any attribute for indicating that some entries are de-allocations,
 let alone the actual address of each allocation.

get_traces() does return the traces of the currently allocated memory
blocks. It's not a log of alloc/dealloc calls. The list is not sorted.
If you want a sorted list, use take_snapshot().statistics('lineno') for
example.
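
A small sketch of the sorted view, written against tracemalloc as it
ships in the standard library (start() and stop() rather than the draft
enable()/disable() names used in this thread). The `blocks` list is just
filler to have something to trace.

```python
import tracemalloc

tracemalloc.start()
blocks = [bytes(1000 + i) for i in range(100)]  # something to trace
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group traces by (filename, line number); the result is sorted from
# the biggest to the smallest total size.
stats = snapshot.statistics("lineno")
for stat in stats[:3]:
    print(stat)
```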

 In that case, I would expect disabling (and filtering) to stop
 capturing new allocation events for me, but I would still expect
 tracemalloc to do proper internal maintenance.

tracemalloc has a significant overhead in terms of performance and
memory. The purpose of disable() is to... disable the module, to
remove the overhead completely.

In practice, enable() installs hooks on memory allocators; disable()
uninstalls these hooks.

I don't understand why you are so concerned about disable(). Why would
you like to keep traces and disable the module? I never called
disable() in my own tests, the module is automatically disabled at
exit.
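
To get a feel for that overhead, here is a hedged sketch using the
module as shipped in the standard library, where start() and stop()
play the role of the draft enable() and disable(). The `junk` list is
only there to generate allocations.

```python
import tracemalloc

tracemalloc.start()  # install the allocator hooks
junk = [bytes(100) for _ in range(10_000)]

# Size of the traced blocks right now and at the peak, in bytes.
current, peak = tracemalloc.get_traced_memory()
# Memory used by tracemalloc itself to store the traces, in bytes.
overhead = tracemalloc.get_tracemalloc_memory()

tracemalloc.stop()   # uninstall the hooks and drop all traces
after_stop = tracemalloc.get_traced_memory()
print(current, peak, overhead, after_stop)
```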

Victor


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-30 Thread Victor Stinner
2013/10/30 Stephen J. Turnbull step...@xemacs.org:
 Just reset implies to me that you're ready to start over.  Not just
 traced memory blocks but accumulated statistics and any configuration
 (such as Filters) would also be reset.  Also tracing would be disabled
 until started explicitly.

If the name is really the problem, I propose to restore the previous
name: clear_traces(). It's symmetric with get_traces(), like
add_filter()/get_filters()/clear_filters().


 Shouldn't disable() do this automatically, perhaps with an optional
 discard_traces flag (which would be False by default)?

The pattern is something like that:

enable()
snapshot1 = take_snapshot()
...
snapshot2 = take_snapshot()
disable()

I don't see why disable() would return data.


 But I definitely agree with Jim:  You *must* provide an example here
 showing how to save the traces (even though it's trivial to do so),
 because that will make clear that disable() is a destructive
 operation.  (It is not destructive in any other debugging tool that
 I've used.)  Even with documentation, be prepared for user complaints.

I added "Call get_traces() or take_snapshot() function to get traces
before clearing them." to the doc:

http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html#tracemalloc.disable
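
A short sketch of "save first, then stop", using the standard library
module as shipped (stop() corresponds to the draft disable() here). The
`data` list is filler; the point is that a snapshot taken beforehand
stays usable after the traces are cleared.

```python
import tracemalloc

tracemalloc.start()
data = [bytes(2000) for _ in range(50)]
snapshot = tracemalloc.take_snapshot()  # save the traces first
tracemalloc.stop()                      # stops tracing, clears traces

still_tracing = tracemalloc.is_tracing()
# The snapshot is an independent copy, so it can still be analysed.
saved_total = sum(stat.size for stat in snapshot.statistics("filename"))
print(still_tracing, saved_total)
```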

Victor


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-30 Thread Jim Jewett
On Wed, Oct 30, 2013 at 6:02 AM, Victor Stinner
victor.stin...@gmail.com wrote:
 2013/10/30 Jim J. Jewett jimjjew...@gmail.com:
 Well, unless I missed it... I don't see how to get anything beyond
 the return value of get_traces, which is a (time-ordered?) list
 of allocation size with then-current call stack.  It doesn't mention
 any attribute for indicating that some entries are de-allocations,
 let alone the actual address of each allocation.


 get_traces() does return the traces of the currently allocated memory
 blocks. It's not a log of alloc/dealloc calls. The list is not sorted.
 If you want a sorted list, use take_snapshot().statistics('lineno') for
 example.

Any list is sorted somehow; I had assumed that it was defaulting to
order-of-creation, though if you use a dict internally, that might not
be the case.  If you return it as a list instead of a dict, but that list is
NOT in time-order, that is worth documenting.

Also, am I misreading the documentation of get_traces() function?

Get traces of memory blocks allocated by Python.
Return a list of (size: int, traceback: tuple) tuples.
traceback is a tuple of (filename: str, lineno: int) tuples.


So it now sounds like you don't bother to emit de-allocation
events because you just remove the allocation from your
internal data structure.

In other words, you provide a snapshot, but not a history --
except that the snapshot isn't complete either, because it
only shows things that appeared after a certain event
(the most recent enablement).

I still don't see anything here(*) that requires even saving
the address, let alone preventing re-use.

(*) get_object_traceback(obj) might require a stored
 address for efficiency, but the base functionality of
getting traces doesn't.

I still wouldn't worry about address re-use though,
because the address should not be re-used until
the object has been deleted -- and is no longer
available to be passed to get_object_traceback.
So the worst that can happen is that an object which
was not traced might return a bogus answer
instead of failing.
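
For reference, a hedged sketch of that lookup with the standard library
module as shipped. The helper `allocate()` is made up for the example;
it only ensures the object is created at run time.

```python
import tracemalloc


def allocate(size):
    return bytes(size)  # created at run time, so it can be traced


before = allocate(5000)  # allocated before tracing starts
tracemalloc.start()
after = allocate(5000)   # allocated while tracing
tb_before = tracemalloc.get_object_traceback(before)
tb_after = tracemalloc.get_object_traceback(after)
tracemalloc.stop()

# An object whose allocation was not traced has no stored traceback.
print(tb_before)
# A traced object reports where it was allocated.
print(tb_after)
```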

 In that case, I would expect disabling (and filtering) to stop
 capturing new allocation events for me, but I would still expect
 tracemalloc to do proper internal maintenance.

 tracemalloc has a significant overhead in terms of performance and
 memory. The purpose of disable() is to... disable the module, to
 remove completely the overhead.
 ...  Why would you like to keep traces and disable the module?

Because of that very overhead.  I think my typical use case would
be similar to Kristján Valur's, but I'll try to spell it out in more
detail here.

(1)  Whoa -- memory hog!  How can I fix this?

(2)  I know -- track all allocations, with a traceback showing why they
were made.  (At a minimum, I would like to be able to subclass your
tool to do this -- preferably without also keeping the full history in
memory.)

(3)  Oh, maybe I should skip the ones that really are temporary and
get cleaned up.  (You make this easy by handling the de-allocs,
though I'm not sure those events get exposed to anyone working at
the python level, as opposed to modifying and re-compiling.)

(4)  hmm... still too big ... I should use filters.  (But will changing those
filters while tracing is enabled mess up your current implementation?)

(5)  Argh.  What I really want is to know what gets allocated at times
like XXX.  I can do that if times-like-XXX only ever occur once per
process.  I *might* be able to do it with filters.  But I would rather
do it by saying "trace on" and "trace off".  Maybe even with a context
manager around the suspicious places.

(6)  Then, at the end of the run, I would say "give me the info about
how much was allocated when tracing was on."  Some of that might be
going away again when tracing is off, but at least I know what is making
the allocations in the first place.  And I know that they're sticking
around long enough.

Under your current proposal, step (5) turns into

set filters
trace on
...
get_traces
serialize to some other storage
trace off

 and step (6) turns into
read in from that other storage I just made up on the fly, and do my own
summarizing, because my format is almost by definition non-standard.

This complication isn't intolerable, but neither is it what I expect
from Python.  And it certainly isn't what I expect from a binary toggle
like enable/disable.
(So yes, changing the name to clear_traces would help, because I would
still be disappointed, but at least I wouldn't be surprised.)

Also, if you do stick with the current limitations, then why even have
get_traces, as opposed to just take_snapshot?  Is there some difference
between them, except that a snapshot has some convenience methods and
some simple metadata?

Later, he wrote:
 I don't see why disable() would return data.

disable is indeed a bad name for something that returns data.

The only reason to return data from 

Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-30 Thread Victor Stinner
On 30 Oct 2013 20:58, Jim Jewett jimjjew...@gmail.com wrote:
 though if you use a dict internally, that might not
 be the case.

Tracemalloc uses a {address: trace} dict internally.

  If you return it as a list instead of a dict, but that list is
 NOT in time-order, that is worth documenting

OK, I will document it.

 Also, am I misreading the documentation of get_traces() function?

 Get traces of memory blocks allocated by Python.
 Return a list of (size: int, traceback: tuple) tuples.
 traceback is a tuple of (filename: str, lineno: int) tuples.


 So it now sounds like you don't bother to emit de-allocation
 events because you just remove the allocation from your
 internal data structure.

I don't understand your question. Tracemalloc does not store events but
traces. When a memory block is deallocated, it is removed from the
internal dict (and so from the get_traces() list).

 I still don't see anything here(*) that requires even saving
 the address, let alone preventing re-use.

The address must be stored internally to maintain the internal dict. See
the C code.

 (1)  Whoa -- memory hog!  How can I fix this?

 (2)  I know -- track all allocations, with a traceback showing why they
 were made.  (At a minimum, I would like to be able to subclass your
 tool to do this -- preferably without also keeping the full history in
 memory.)

What do you mean by "full history" and "subclass your tool"?

 (3)  Oh, maybe I should skip the ones that really are temporary and
 get cleaned up.  (You make this easy by handling the de-allocs,
 though I'm not sure those events get exposed to anyone working at
 the python level, as opposed to modifying and re-compiling.)

If your temporary objects are destroyed before you call get_traces(), you
will not see them in get_traces(). I don't understand.

 (4)  hmm... still too big ... I should use filters.  (But will changing
 those filters while tracing is enabled mess up your current
 implementation?)

If you call add_filter(), new traces will be filtered, but not the old
ones, as explained in the doc. What do you mean by "mess up"?
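
For comparison, in the module as shipped in the standard library,
filters can be applied to a snapshot after the fact with
Snapshot.filter_traces(), which leaves the tracing state and the
original snapshot untouched. A minimal sketch; the two filename
patterns are only illustrative choices.

```python
import tracemalloc

tracemalloc.start()
payload = [bytes(1500) for _ in range(20)]
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Exclusive filters: drop traces whose most recent frame matches.
filters = [
    tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
    tracemalloc.Filter(False, "<unknown>"),
]
filtered = snapshot.filter_traces(filters)  # returns a new Snapshot

total = len(snapshot.traces)
kept = len(filtered.traces)
print(f"kept {kept} of {total} traces")
```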

 (5)  Argh.  What I really want is to know what gets allocated at times
 like XXX.  I can do that if times-like-XXX only ever occur once per
 process.  I *might* be able to do it with filters.  But I would rather
 do it by saying "trace on" and "trace off".  Maybe even with a context
 manager around the suspicious places.

I don't understand "times like XXX". What is that?

To see what happened between two lines of code, you can compare two
snapshots. No need to disable tracing.

 (6)  Then, at the end of the run, I would say "give me the info about
 how much was allocated when tracing was on."  Some of that might be
 going away again when tracing is off, but at least I know what is
 making the allocations in the first place.  And I know that they're
 sticking around long enough.

I think you misunderstood how tracemalloc works. You should compile it
and play with it. In my opinion, you already have everything in
tracemalloc for your scenario.

 Under your current proposal, step (5) turns into

 set filters
 trace on
 ...
 get_traces
 serialize to some other storage
 trace off

s1=take_snapshot()
...
s2=take_snapshot()
...
diff=s2.statistics('lines', compare_to=s1)
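
A runnable sketch of the same idea against the standard library module
as shipped, where the comparison is spelled Snapshot.compare_to() and
the grouping key is 'lineno'. The `retained` list stands in for whatever
the code between the two snapshots allocates.

```python
import tracemalloc

tracemalloc.start()
s1 = tracemalloc.take_snapshot()
retained = [bytes(4096) for _ in range(100)]  # allocated between snapshots
s2 = tracemalloc.take_snapshot()
tracemalloc.stop()

# List of StatisticDiff entries, biggest absolute size change first.
diff = s2.compare_to(s1, "lineno")
for entry in diff[:3]:
    print(entry)

growth = sum(entry.size_diff for entry in diff)
print(f"net growth between snapshots: {growth} bytes")
```

Because both snapshots are taken while tracing stays on, nothing has to
be disabled to isolate the region of interest.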

 why even have get_traces, as opposed to just take_snapshot?  Is there
 some difference between them, except that a snapshot has some
 convenience methods and some simple metadata?

See the doc: Snapshot.traces is the result of get_traces().

get_traces() is there if you want to write your own tool without Snapshot.
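
As a rough illustration of a home-made tool built on the raw traces,
the sketch below reads Snapshot.traces from the standard library module
as shipped and computes its own summary. The report format is invented
for the example.

```python
import tracemalloc

tracemalloc.start()
keep = [bytes(3000) for _ in range(10)]
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Each trace describes one live memory block: its size and traceback.
sizes = [trace.size for trace in snapshot.traces]
total = sum(sizes)
largest = max(sizes)
print(f"{len(sizes)} blocks, {total} bytes, largest {largest} bytes")
```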

Victor


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-30 Thread Stephen J. Turnbull
Jim Jewett writes:

  Later, he wrote:
   I don't see why disable() would return data.
  
  disable is indeed a bad name for something that returns data.

Note that I never proposed that disable() *return* anything, only that
it *get* the trace.  It could store it in some specified object, or a
file, rather than return it, for example.  I deliberately left what it
does with the retrieved data unspecified.  The important thing to me
is that it not be dropped on the floor by something named disable.


Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-29 Thread Kristján Valur Jónsson
 disable() function:
 
 Stop tracing Python memory allocations and clear traces of
 memory blocks allocated by Python.
 
 I would expect disable to stop tracing, but I would not expect it to
 clear out the traces it had already captured.  If it has to do that,
 please put in some sample code showing how to save the current traces
 before disabling.

I was thinking something similar.  It would be useful to be able to pause
and resume if one is doing any analysis work in the live environment.
This would reduce the need to have Filter objects.

K



Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-29 Thread Victor Stinner
2013/10/29 Jim Jewett jimjjew...@gmail.com:
 reset() function:

 Clear traces of memory blocks allocated by Python.

 Does this do anything besides clear?  If not, why not just re-use the
 'clear' name from dicts?

(I like the reset() name. Charles-François suggested this name
inspired by OProfile API.)

 disable() function:

 Stop tracing Python memory allocations and clear traces of
 memory blocks allocated by Python.

 I would expect disable to stop tracing, but I would not expect it to
 clear out the traces it had already captured.  If it has to do that,
 please put in some sample code showing how to save the current traces
 before disabling.

For consistency, you cannot keep traces when tracing is disabled. The
free() hook must stay enabled to remove the traces of freed memory
blocks, or the next malloc() may get the same address, which would raise
an assertion error (you cannot have two memory blocks at the same
address).
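
The address reuse problem can be sketched with a toy table. This is a
hypothetical pure-Python model with made-up addresses; `ToyTraces` and
its `hooks_installed` switch are illustrative and do not mirror the C
code.

```python
class ToyTraces:
    """Toy {address: trace} table guarding against duplicate addresses."""

    def __init__(self):
        self.traces = {}
        self.hooks_installed = True

    def malloc(self, address, trace):
        if self.hooks_installed:
            if address in self.traces:
                # Two live blocks can never share an address.
                raise RuntimeError("stale trace at reused address")
            self.traces[address] = trace

    def free(self, address):
        if self.hooks_installed:
            self.traces.pop(address, None)


table = ToyTraces()
table.malloc(0x1000, "first block")
table.hooks_installed = False  # pause all hooks but keep the traces
table.free(0x1000)             # missed: the trace is now stale
table.hooks_installed = True   # resume
try:
    table.malloc(0x1000, "second block")  # allocator reuses the address
    conflict = False
except RuntimeError:
    conflict = True
print(conflict)  # True
```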

Just call get_traces() to get traces before clearing them. I can
explain it in the doc.

2013/10/29 Kristján Valur Jónsson krist...@ccpgames.com:
 I was thinking something similar.  It would be useful to be able to pause
 and resume if one is doing any analysis work in the live environment.
 This would reduce the need to have Filter objects.

For the reason explained above, it's not possible to disable the whole
module temporarily.

Internally, tracemalloc uses a thread-local variable (called the
"reentrant" flag) to temporarily disable tracing of allocations in the
current thread. It only disables tracing of new allocations;
deallocations are still processed.

Victor


[Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-29 Thread Jim J. Jewett

 
(Tue Oct 29 12:37:52 CET 2013) Victor Stinner wrote:

 For consistency, you cannot keep traces when tracing is disabled.
 The free() must be enabled to remove allocated memory blocks, or
 next malloc() may get the same address which would raise an assertion
 error (you cannot have two memory blocks at the same address).

That seems like a quirk of the implementation, particularly since
the actual address is not returned to the user.  Nor do I see any way
of knowing when that allocation is freed.

Well, unless I missed it... I don't see how to get anything beyond
the return value of get_traces, which is a (time-ordered?) list 
of allocation size with then-current call stack.  It doesn't mention
any attribute for indicating that some entries are de-allocations,
let alone the actual address of each allocation.

 For the reason explained above, it's not possible to disable the whole
 module temporarily.

 Internally, tracemalloc uses a thread-local variable (called the
 "reentrant" flag) to temporarily disable tracing of allocations in the
 current thread. It only disables tracing of new allocations;
 deallocations are still processed.

Even assuming the restriction is needed, this just seems to mean that
disabling (or filtering) should not affect de-allocation events, for
fear of corrupting tracemalloc's internal structures.

In that case, I would expect disabling (and filtering) to stop
capturing new allocation events for me, but I would still expect
tracemalloc to do proper internal maintenance.

It would at least explain why you need both disable *and* reset;
reset would empty those internal structures, so that tracemalloc
could shortcut that maintenance.  I would NOT assume that I needed
to call reset when changing the filters, nor would I assume that
changing them threw out existing traces.

-jJ

-- 

If there are still threading problems with my replies, please 
email me with details, so that I can try to resolve them.  -jJ



Re: [Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-29 Thread Stephen J. Turnbull
Victor Stinner writes:

  2013/10/29 Jim Jewett jimjjew...@gmail.com:
   reset() function:
  
   Clear traces of memory blocks allocated by Python.
  
   Does this do anything besides clear?  If not, why not just re-use the
   'clear' name from dicts?
  
  (I like the reset() name. Charles-François suggested this name
  inspired by OProfile API.)

Just reset implies to me that you're ready to start over.  Not just
traced memory blocks but accumulated statistics and any configuration
(such as Filters) would also be reset.  Also tracing would be disabled
until started explicitly.

If you want it to apply just to the traces, reset_traces() would be
more appropriate.

   disable() function:
  
   Stop tracing Python memory allocations and clear traces of
   memory blocks allocated by Python.
  
   I would expect disable to stop tracing, but I would not expect it to
   clear out the traces it had already captured.  If it has to do that,
   please put in some sample code showing how to save the current traces
   before disabling.
  
  For consistency, you cannot keep traces when tracing is disabled. The
  free() must be enabled to remove allocated memory blocks, or next
  malloc() may get the same address which would raise an assertion error
  (you cannot have two memory blocks at the same address).

Then I would not call this disable.  disable() should not destroy data.

  Just call get_traces() to get traces before clearing them. I can
  explain it in the doc.

Shouldn't disable() do this automatically, perhaps with an optional
discard_traces flag (which would be False by default)?

But I definitely agree with Jim:  You *must* provide an example here
showing how to save the traces (even though it's trivial to do so),
because that will make clear that disable() is a destructive
operation.  (It is not destructive in any other debugging tool that
I've used.)  Even with documentation, be prepared for user complaints.


[Python-Dev] PEP 454 (tracemalloc) disable == clear?

2013-10-28 Thread Jim Jewett
reset() function:

Clear traces of memory blocks allocated by Python.

Does this do anything besides clear?  If not, why not just re-use the
'clear' name from dicts?


disable() function:

Stop tracing Python memory allocations and clear traces of
memory blocks allocated by Python.

I would expect disable to stop tracing, but I would not expect it to
clear out the traces it had already captured.  If it has to do that,
please put in some sample code showing how to save the current traces
before disabling.

-jJ