Re: [Python-Dev] Completing the email6 API changes.

2013-09-03 Thread R. David Murray
On Tue, 03 Sep 2013 10:56:36 +0900, "Stephen J. Turnbull" wrote:
> R. David Murray writes:
>  > I can understand the structure Glen found in Applemail:
>  > a series of text/plain parts interspersed with image/jpg, with all parts
>  > after the first being marked 'Content-Disposition: inline'.  Any MUA
>  > that can display text and images *ought* to handle that correctly and
>  > produce the expected result.  But that isn't what your structure above
>  > would produce.  If you did:
>  > 
>  >     multipart/related
>  >         multipart/alternative
>  >             text/html
>  >             text/plain
>  >         image/png
>  >         text/plain
>  >         image/png
>  >         text/plain
>  > 
>  > and only referred to the png parts in the text/html part and marked all
>  > the parts as 'inline' (even though that is irrelevant in the text/html
>  > related case), an MUA that *knew* about this technique *could* display it
>  > "correctly", but an MUA that is just following the standards most
>  > likely won't.
> 
> OK, I see that now.  It requires non-MIME information about the
> treatment of the root entity by the implementation.  On the other
> hand, it shouldn't *hurt*.  RFC 2387 explicitly specifies that at
> least some parts of a contained multipart/related part should be able
> to refer to entities related via the containing multipart/related.
> Since it does not mention *any* restrictions on contained root
> entities, I take it that it implicitly specifies that any contained
> multipart may make such references.  But I suspect it's not
> implemented by most MUAs.  I'll have to test.

OK, I see what you are driving at now.  Whether or not it works is
dependent on whether or not typical MUAs handle a multipart/related with
a text/plain root part by treating it as if it were a multipart/mixed
with inline or attachment sub-parts.  So yes, whether or not we should
support and/or document this technique very much depends on whether or
not typical MUAs do so.  I will, needless to say, be very interested in
the results of your research :)
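
For anyone who wants to try this against their own MUA, here is a minimal sketch of how the structure quoted above could be built with the long-standing email.mime classes. The helper name, the Content-ID values and the body text are made up for illustration, and the order of parts inside multipart/alternative simply mirrors the listing above.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_related_message(png_bytes):
    """Build a multipart/related whose root part is a multipart/alternative."""
    related = MIMEMultipart('related')

    # Root part: only the html alternative refers to the images (cid: URLs).
    alternative = MIMEMultipart('alternative')
    alternative.attach(MIMEText(
        '<p>One</p><img src="cid:img1"><p>Two</p><img src="cid:img2">',
        'html'))
    alternative.attach(MIMEText('One', 'plain'))
    related.attach(alternative)

    # Remaining parts: images interleaved with text, all marked inline.
    for index, caption in enumerate(['Two', 'Three'], start=1):
        image = MIMEImage(png_bytes, _subtype='png')
        image['Content-ID'] = '<img%d>' % index
        image['Content-Disposition'] = 'inline'
        related.attach(image)

        text = MIMEText(caption, 'plain')
        text['Content-Disposition'] = 'inline'
        related.attach(text)
    return related


if __name__ == '__main__':
    # The bytes are only a placeholder, not a valid image.
    message = build_related_message(b'\x89PNG\r\n\x1a\n')
    for part in message.walk():
        print(part.get_content_type())
```

Sending the result to a few MUAs would show which of them fall back to a multipart/mixed style display.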

--David
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Completing the email6 API changes.

2013-09-03 Thread R. David Murray
On Tue, 03 Sep 2013 10:01:42 -0400, "R. David Murray" wrote:
> On Tue, 03 Sep 2013 10:56:36 +0900, "Stephen J. Turnbull" wrote:
> > R. David Murray writes:
> >  > I can understand the structure Glen found in Applemail:
> >  > a series of text/plain parts interspersed with image/jpg, with all parts
> >  > after the first being marked 'Content-Disposition: inline'.  Any MUA
> >  > that can display text and images *ought* to handle that correctly and
> >  > produce the expected result.  But that isn't what your structure above
> >  > would produce.  If you did:
> >  > 
> >  >     multipart/related
> >  >         multipart/alternative
> >  >             text/html
> >  >             text/plain
> >  >         image/png
> >  >         text/plain
> >  >         image/png
> >  >         text/plain
> >  > 
> >  > and only referred to the png parts in the text/html part and marked all
> >  > the parts as 'inline' (even though that is irrelevant in the text/html
> >  > related case), an MUA that *knew* about this technique *could* display it
> >  > "correctly", but an MUA that is just following the standards most
> >  > likely won't.
> > 
> > OK, I see that now.  It requires non-MIME information about the
> > treatment of the root entity by the implementation.  On the other
> > hand, it shouldn't *hurt*.  RFC 2387 explicitly specifies that at
> > least some parts of a contained multipart/related part should be able
> > to refer to entities related via the containing multipart/related.
> > Since it does not mention *any* restrictions on contained root
> > entities, I take it that it implicitly specifies that any contained
> > multipart may make such references.  But I suspect it's not
> > implemented by most MUAs.  I'll have to test.
> 
> OK, I see what you are driving at now.  Whether or not it works is
> dependent on whether or not typical MUAs handle a multipart/related with
> a text/plain root part by treating it as if it were a multipart/mixed

I meant "a text/plain root part *inside* a multipart/alternative", which
is what you said, I just didn't understand it at first :)  Although I
wonder how many GUI MUAs do the fallback to multipart/mixed with just a
normal text/plain root part, too.  I would expect a text-only MUA would,
since it has no other way to display a multipart/related...but a
graphical MUA might just assume that there will always be an html part
in a multipart/related.

> with inline or attachment sub-parts.  So yes, whether or not we should
> support and/or document this technique very much depends on whether or
> not typical MUAs do so.  I will, needless to say, be very interested in
> the results of your research :)
> 
> --David


[Python-Dev] Python 2.6 to end support with 2.6.9 in October 2013

2013-09-03 Thread Barry Warsaw
Hello Pythonistas,

Python 2.6.9 is the last planned release of the 2.6.x series.  This will be a
security-only source-only release.  It is currently scheduled for October
2013, and after this Python 2.6 will have reached its end-of-life and the
branch will be retired.

http://www.python.org/dev/peps/pep-0361/

I would like to release 2.6.9rc1 on Monday, September 30th, and 2.6.9 final on
Monday, October 28th.  I've added both dates to the Python calendar.

Here is the list of candidates still to be fixed for 2.6.9:

- 18747 - Re-seed OpenSSL's PRNG after fork
- 16037 - httplib: header parsing is not delimited
- 16038 - ftplib: unlimited readline() from connection
- 16039 - imaplib: unlimited readline() from connection
- 16040 - nntplib: unlimited readline() from connection
- 16041 - poplib: unlimited readline() from connection
- 16042 - smtplib: unlimited readline() from connection
- 16043 - xmlrpc: gzip_decode has unlimited read()

These were the ones I previously had on my list, and I've now marked these all
as release blockers for 2.6.9... for now.

If you know of any others that I should be aware of, please let me know.

If you can contribute to 2.6.9 by reviewing, testing, developing, or
commenting, that would be greatly appreciated.  I will be spending some time
triaging these and any other issues that get identified as possible 2.6.9
candidates.

If you have any questions regarding 2.6.9, please contact me via mailing list
or IRC.

Cheers,
-Barry




Re: [Python-Dev] Python 2.6 to end support with 2.6.9 in October 2013

2013-09-03 Thread Ryan
I'm still waiting on Python 2.7 for Android! Stuck on 2.6 for now...ugh!

Wonder if I can build it myself...

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


[Python-Dev] RFC: PEP 454: Add a new tracemalloc module

2013-09-03 Thread Victor Stinner
Hi,

Antoine Pitrou suggested that I write a PEP to discuss the API of the
new tracemalloc module that I proposed to add to Python 3.4. Here it
is.

If you prefer to read the HTML version:
http://www.python.org/dev/peps/pep-0454/

See also the documentation of the current implementation of the module.
http://hg.python.org/features/tracemalloc/file/tip/Doc/library/tracemalloc.rst

The documentation contains examples and a short "tutorial".


PEP: 454
Title: Add a new tracemalloc module to trace Python memory allocations
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 3-September-2013
Python-Version: 3.4


Abstract
========

Add a new ``tracemalloc`` module to trace Python memory allocations.



Rationale
=========

Common debug tools that trace memory allocations record the C filename and
line number.  Using such tools to analyze Python memory allocations does not
help because most memory allocations are done in the same C function,
``PyMem_Malloc()`` for example.

There are debug tools dedicated to the Python language like ``Heapy``
and ``PySizer``. These projects analyze object types and/or content.
These tools are useful when most of the leaked memory consists of instances
of the same type and this type is allocated in only a few functions. The
problem arises when the object type is very common, like ``str`` or
``tuple``, and it is hard to identify where these objects are allocated.

Finding reference cycles is also a difficult task. There are different
tools to draw a diagram of all references. These tools cannot be used
on large applications with thousands of objects because the diagram
is too huge to be analyzed manually.


Proposal
========

With PEP 445, it becomes easy to set up a hook on Python memory
allocators. The hook can inspect the current Python frame to get the
Python filename and line number.
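
As a rough pure-Python analogy of what such a hook does (the real hook is written in C on top of the PEP 445 allocator API), the sketch below wraps an allocation and records where it was requested. ``traced_bytearray`` and the ``traces`` dictionary are hypothetical names used only for illustration, and ``sys._getframe()`` is a CPython implementation detail.

```python
import sys

# address -> (size, filename, lineno); a stand-in for the module's trace table
traces = {}


def traced_bytearray(size):
    """Allocate a bytearray and remember which Python line asked for it."""
    block = bytearray(size)
    frame = sys._getframe(1)  # frame 0 is this function, frame 1 is the caller
    traces[id(block)] = (size, frame.f_code.co_filename, frame.f_lineno)
    return block


if __name__ == '__main__':
    data = traced_bytearray(4096)
    print(traces[id(data)])
```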

This PEP proposes to add a new ``tracemalloc`` module. It is a debug
tool to trace memory allocations made by Python. The module provides the
following information:

* Statistics on Python memory allocations per Python filename and line
  number: size, number, and average size of allocations
* Compute differences between two snapshots of Python memory allocations
* Location of a Python memory allocation: size in bytes, Python filename
  and line number
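
To give an idea of the snapshot comparison, here is a small sketch. Note the assumption: it uses the names of the ``tracemalloc`` module as it later shipped in the standard library (``start()``, ``take_snapshot()``, ``Snapshot.compare_to()``), which differ from the function names in this draft. ``measure_growth`` is a hypothetical helper.

```python
import tracemalloc


def measure_growth(make_objects, top=3):
    """Return the *top* biggest per-line differences caused by make_objects()."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        keep_alive = make_objects()  # hold a reference until the second snapshot
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() sorts by absolute size difference, biggest first
    return after.compare_to(before, 'lineno')[:top]


if __name__ == '__main__':
    for diff in measure_growth(lambda: [bytes(1000) for _ in range(1000)]):
        print(diff)
```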


Command line options
====================

The ``python -m tracemalloc`` command can be used to analyze and compare
snapshots. The command takes a list of snapshot filenames and has the
following options.

``-g``, ``--group-per-file``

Group allocations per filename, instead of grouping per line number.

``-n NTRACES``, ``--number NTRACES``

Number of traces displayed per top (default: 10).

``--first``

Compare with the first snapshot, instead of comparing with the
previous snapshot.

``--include PATTERN``

Only include filenames matching pattern *PATTERN*. The option can be
specified multiple times.

See ``fnmatch.fnmatch()`` for the syntax of patterns.

``--exclude PATTERN``

Exclude filenames matching pattern *PATTERN*. The option can be
specified multiple times.

See ``fnmatch.fnmatch()`` for the syntax of patterns.

``-S``, ``--hide-size``

Hide the size of allocations.

``-C``, ``--hide-count``

Hide the number of allocations.

``-A``, ``--hide-average``

Hide the average size of allocations.

``-P PARTS``, ``--filename-parts=PARTS``

Number of displayed filename parts (default: 3).

``--color``

Force usage of colors even if ``sys.stdout`` is not a TTY device.

``--no-color``

Disable colors if ``sys.stdout`` is a TTY device.
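
The ``--include`` and ``--exclude`` options can be pictured as the filter below. This is only a sketch under two assumptions that the text above does not state: an empty include list keeps every filename, and an exclude match wins over an include match. ``filename_selected`` is a hypothetical name.

```python
import fnmatch


def filename_selected(filename, include=(), exclude=()):
    """Apply include patterns first, then exclude patterns."""
    if include and not any(fnmatch.fnmatch(filename, pattern)
                           for pattern in include):
        return False
    return not any(fnmatch.fnmatch(filename, pattern) for pattern in exclude)


if __name__ == '__main__':
    names = ['/usr/lib/python3.4/os.py', '/home/user/app/main.py']
    print([name for name in names
           if filename_selected(name, include=['*.py'], exclude=['*/lib/*'])])
```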


API
===

To trace as many Python memory allocations as possible, the module should
be enabled as early as possible in your application by calling the
``tracemalloc.enable()`` function, by setting the ``PYTHONTRACEMALLOC``
environment variable to ``1``, or by using the ``-X tracemalloc`` command
line option.


Functions
---------

``enable()`` function:

Start tracing Python memory allocations.

``disable()`` function:

Stop tracing Python memory allocations and stop the timer started by
``start_timer()``.

``is_enabled()`` function:

Get the status of the module: ``True`` if it is enabled, ``False``
otherwise.

``get_object_address(obj)`` function:

Get the address of the memory block of the specified Python object.

``get_object_trace(obj)`` function:

Get the trace of a Python object *obj* as a ``trace`` instance.

Return ``None`` if the tracemalloc module did not save the location
when the object was allocated, for example if the module was
disabled.

``get_process_memory()`` function:

Get the memory usage of the current process as a meminfo namedtuple
with two attributes:

* ``rss``: Resident Set Size in bytes
* ``vms``: size of the virtual memory in bytes

Return ``None`` if the platform is not supported.

Use the ``psutil`` module if

Re: [Python-Dev] RFC: PEP 454: Add a new tracemalloc module

2013-09-03 Thread Victor Stinner
> ``get_object_trace(obj)`` function:
>
> Get the trace of a Python object *obj* as a ``trace`` instance.
>
> Return ``None`` if the tracemalloc module did not save the location
> when the object was allocated, for example if the module was
> disabled.

This function and get_traces() can be reused by other debug tools like
Heapy and objgraph to add where objects were allocated.

> ``get_stats()`` function:
>
> Get statistics on Python memory allocations per Python filename and
> per Python line number.
>
> Return a dictionary
> ``{filename: str -> {line_number: int -> stats: line_stat}}``
> where *stats* is a ``line_stat`` instance. *filename* and
> *line_number* can be ``None``.
>
> Return an empty dictionary if the tracemalloc module is disabled.
>
> ``get_traces(obj)`` function:
>
>Get all traces of Python memory allocations.
>Return a dictionary ``{pointer: int -> trace}`` where *trace*
>is a ``trace`` instance.
>
>Return an empty dictionary if the ``tracemalloc`` module is disabled.

get_stats() can be computed from get_traces(), for example:
----------
import pprint, tracemalloc

traces = tracemalloc.get_traces()
stats = {}
for trace in traces.values():
    # first level: one dictionary per filename
    if trace.filename not in stats:
        stats[trace.filename] = line_stats = {}
    else:
        line_stats = stats[trace.filename]
    # second level: accumulate (size, count) per line number
    if trace.lineno not in line_stats:
        line_stats[trace.lineno] = line_stat = tracemalloc.line_stat((0, 0))
        size = trace.size
        count = 1
    else:
        line_stat = line_stats[trace.lineno]
        size = line_stat.size + trace.size
        count = line_stat.count + 1
    line_stats[trace.lineno] = tracemalloc.line_stat((size, count))

pprint.pprint(stats)
----------

The problem is the efficiency. At startup, Python already allocated
more than 20,000 memory blocks:

$ ./python -X tracemalloc -c 'import tracemalloc;
print(len(tracemalloc.get_traces()))'
21704

At the end of the Python test suite, Python allocated more than
500,000 memory blocks.

Storing all these traces in a snapshot eats a lot of memory, disk
space and uses CPU to build the statistics.
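
To experiment with this aggregation without the draft module, the same grouping can be run on stand-in types. ``Trace`` and ``LineStat`` below are hypothetical namedtuples that only mimic the attributes described in the PEP; they are not the module's classes. The point of the sketch is that the resulting table has one entry per (filename, line number) pair, not one per memory block.

```python
from collections import namedtuple

# Hypothetical stand-ins for the draft ``trace`` and ``line_stat`` types
Trace = namedtuple('Trace', 'size filename lineno')
LineStat = namedtuple('LineStat', 'size count')


def stats_from_traces(traces):
    """Group a {pointer: Trace} mapping into {filename: {lineno: LineStat}}."""
    stats = {}
    for trace in traces.values():
        line_stats = stats.setdefault(trace.filename, {})
        previous = line_stats.get(trace.lineno, LineStat(0, 0))
        line_stats[trace.lineno] = LineStat(previous.size + trace.size,
                                            previous.count + 1)
    return stats


if __name__ == '__main__':
    example = {
        0x1000: Trace(32, 'a.py', 10),
        0x2000: Trace(16, 'a.py', 10),
        0x3000: Trace(8, None, None),
    }
    print(stats_from_traces(example))
```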

> ``start_timer(delay: int, func: callable, args: tuple=(), kwargs:
> dict={})`` function:
>
> Start a timer calling ``func(*args, **kwargs)`` every *delay*
> seconds. (...)
>
> If ``start_timer()`` is called twice, previous parameters are
> replaced.  The timer has a resolution of 1 second.
>
> ``start_timer()`` is used by ``DisplayTop`` and ``TakeSnapshot`` to
> regularly run a task.

So DisplayTop and TakeSnapshot cannot be used at the same time. It
would be convenient to be able to register more than one function.
What do you think?
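
One possible shape for that, as a sketch only: keep a list of tasks instead of a single set of parameters, and let the timer check each deadline. ``TaskScheduler``, ``register()`` and ``tick()`` are hypothetical names, not part of the current implementation, and the clock is passed in explicitly to keep the example deterministic.

```python
class TaskScheduler:
    """Hypothetical registry allowing several periodic callbacks."""

    def __init__(self):
        # each task is [delay, next_run, func, args, kwargs]
        self._tasks = []

    def register(self, delay, func, args=(), kwargs=None, now=0):
        """Schedule func(*args, **kwargs) every *delay* seconds."""
        self._tasks.append([delay, now + delay, func, args, kwargs or {}])

    def tick(self, now):
        """Run every task whose deadline has passed; return how many ran."""
        ran = 0
        for task in self._tasks:
            delay, next_run, func, args, kwargs = task
            if now >= next_run:
                func(*args, **kwargs)
                task[1] = now + delay  # reschedule relative to this run
                ran += 1
        return ran


if __name__ == '__main__':
    scheduler = TaskScheduler()
    scheduler.register(10, print, ('display top',))
    scheduler.register(60, print, ('take snapshot',))
    for second in (10, 60):
        scheduler.tick(second)
```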

> ``trace`` class:
> This class represents debug information of an allocated memory block.
>
> ``size`` attribute:
> Size in bytes of the memory block.
> ``filename`` attribute:
> Name of the Python script where the memory block was allocated,
> ``None`` if unknown.
> ``lineno`` attribute:
> Line number where the memory block was allocated, ``None`` if
> unknown.

I thought about it again, and it would be possible to store more than one
frame per trace instance, to be able to rebuild a (partial) Python traceback.
The hook on the memory allocator has access to the chain of Python
frames. The API should be changed to support such an enhancement.
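
In pure Python, walking the frame chain looks roughly like the sketch below. ``capture_frames`` and its ``max_frames`` limit are hypothetical; the real hook would do the equivalent in C, and ``sys._getframe()`` is CPython-specific.

```python
import sys


def capture_frames(max_frames=3):
    """Return up to max_frames (filename, lineno) pairs, most recent first."""
    frames = []
    frame = sys._getframe(1)  # start at the caller of this function
    while frame is not None and len(frames) < max_frames:
        frames.append((frame.f_code.co_filename, frame.f_lineno))
        frame = frame.f_back  # move one level up the call chain
    return tuple(frames)


def inner():
    return capture_frames(2)


def outer():
    return inner()


if __name__ == '__main__':
    for filename, lineno in outer():
        print('%s:%s' % (filename, lineno))
```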

> ``DisplayTop(count: int=10, file=sys.stdout)`` class:
> Display the list of the *count* biggest memory allocations into
> *file*.
> (...)
> ``group_per_file`` attribute:
>
> If ``True``, group memory allocations per Python filename. If
> ``False`` (default value), group allocation per Python line number.

This attribute is very important. We may add it to the constructor.

By the way, the self.stream attribute is not documented.

> Snapshot class
> --------------
>
> ``Snapshot()`` class:
>
> Snapshot of Python memory allocations.
>
> Use ``TakeSnapshot`` to take snapshots regularly.
>
> ``create(user_data_callback=None)`` method:
>
> Take a snapshot. If *user_data_callback* is specified, it must be a
> callable object returning a list of
> ``(title: str, format: str, value: int)``.
> *format* must be ``'size'``. The list must always have the same
> length and the same order to be able to compute differences between
> values.
>
> Example: ``[('Video memory', 'size', 234902)]``.

(Oops, create() is a class method, not a method.)

Having to call a class method to build an instance of a class is
surprising. But I didn't find a way to implement the load() class
method otherwise.
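
For comparison, here is one common layout, shown as a sketch and not as what the current implementation does: the plain constructor only stores data, and both create() and load() are alternative constructors built on top of it. The ``get_stats`` argument, the ``dump()`` method and the pickle format are assumptions made for the example.

```python
import io
import pickle
import time


class Snapshot:
    """Hypothetical stand-in, not the class from the draft implementation."""

    def __init__(self, timestamp, stats):
        self.timestamp = timestamp
        self.stats = stats

    @classmethod
    def create(cls, get_stats):
        """Alternative constructor: collect fresh data now."""
        return cls(time.time(), get_stats())

    @classmethod
    def load(cls, stream):
        """Alternative constructor: read data written by dump()."""
        timestamp, stats = pickle.load(stream)
        return cls(timestamp, stats)

    def dump(self, stream):
        """Write the snapshot to a binary file object."""
        pickle.dump((self.timestamp, self.stats), stream)


buffer = io.BytesIO()
original = Snapshot.create(lambda: {'a.py': {10: (48, 2)}})
original.dump(buffer)
buffer.seek(0)
restored = Snapshot.load(buffer)
print(restored.stats == original.stats)
```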

The user_data_callback API can be improved. The "format must be size"
restriction is not very convenient.

Victor


Re: [Python-Dev] Completing the email6 API changes.

2013-09-03 Thread Stephen J. Turnbull
R. David Murray writes:

 > I meant "a text/plain root part *inside* a multipart/alternative", which
 > is what you said, I just didn't understand it at first :)  Although I
 > wonder how many GUI MUAs do the fallback to multipart/mixed with just a
 > normal text/plain root part, too.  I would expect a text-only MUA would,
 > since it has no other way to display a multipart/related...but a
 > graphical MUA might just assume that there will always be an html part
 > in a multipart/related.

It's not really a problem with text vs. GUI, or an assumption of HTML.
There are plenty of formats that have such links, and some which don't
have links, but rather assigned roles such as "Mac files" (with data
fork and resource fork) and digital signatures (though that turned out
to be worth designing a new multipart subtype).

The problem is that "multipart/related" says "pass all the part
entities to the handler appropriate to the root part entity, which
will process the links found in the root part entity".  If you
implement that in the natural way, you just pass the text/plain part
to the text/plain handler, which won't find any links for the simple
reason that it has no protocol for representing them.

This means that the kind of multipart/related handler I envision needs
to implement linking itself, rather than delegate it to the root
part handler.  This requires checking the type of the root part:

# not intended to look like Email API
def handle_multipart_related (part_list, root_part):
    if root_part.content_type in ['text/plain']:
        # just display the parts in order
        handle_multipart_mixed (part_list)
    else:
        # cid -> entities in internal representation
        entity_map = extract_entity_map(part_list)
        root_part.content_type.handle(root_part, entity_map)
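
A runnable variant of the same dispatch, written against the existing email package, might look like the sketch below. ``plan_related`` is a hypothetical helper that only reports which path would be taken. It assumes the multipart/related has no ``start`` parameter, so the first part is the root, and it does not descend into a multipart/alternative root.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def plan_related(related):
    """Decide how to render a multipart/related message.

    Returns ('mixed', parts) when the root part is text/plain, otherwise
    ('linked', root, entity_map) with entity_map keyed by Content-ID.
    """
    parts = related.get_payload()
    root = parts[0]  # assumption: no 'start' parameter on the container
    if root.get_content_type() == 'text/plain':
        # no way to express links: show the parts in order
        return ('mixed', parts)
    entity_map = {part['Content-ID']: part
                  for part in parts[1:]
                  if part['Content-ID'] is not None}
    return ('linked', root, entity_map)


def make_example(subtype, body):
    """Build a two-part multipart/related with the given root text subtype."""
    message = MIMEMultipart('related')
    message.attach(MIMEText(body, subtype))
    image = MIMEImage(b'\x89PNG\r\n\x1a\n', _subtype='png')  # placeholder bytes
    image['Content-ID'] = '<img1>'
    message.attach(image)
    return message


if __name__ == '__main__':
    print(plan_related(make_example('plain', 'hello'))[0])
    print(plan_related(make_example('html', '<img src="cid:img1">'))[0])
```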