[issue12911] Expose a private accumulator C API

2011-10-06 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset f9f782f2369e by Antoine Pitrou in branch '3.2':
Issue #12911: Fix memory consumption when calculating the repr() of huge tuples 
or lists.
http://hg.python.org/cpython/rev/f9f782f2369e

New changeset 656c13024ede by Antoine Pitrou in branch 'default':
Issue #12911: Fix memory consumption when calculating the repr() of huge tuples 
or lists.
http://hg.python.org/cpython/rev/656c13024ede

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12911
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12911] Expose a private accumulator C API

2011-10-06 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

I added a comment insisting that the API is private and can be changed at any 
moment.
StringIO can actually re-use that API, rather than the reverse. No need to 
instantiate a full-blown file object when all you want to do is to join a bunch 
of strings.

--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed




[issue12911] Expose a private accumulator C API

2011-10-03 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

> It's not a container type, just a small C struct that
> gets allocated on the stack. Think of it as a library, like stringlib.

That's what I call a container type: a structure with a library :-)

> That's another possibility. But we'd have to expose a
> C API anyway, and this one is as good as any other.

No, it's not: it's additional clutter. If new API needs to be added,
adding it for existing structures is better. Notice that you don't
*need* new API, as you can use StringIO just fine from C also.

> Note that StringIO will copy data twice (once when calling
> write(), once when calling getvalue()), while ''.join() only once (at
> the end, when concatenating all strings).

Sounds like a flaw in StringIO to me. It could also manage a list of strings 
that have been written, rather than only using a flat buffer. Only when someone 
actually needs a linear buffer, it could convert it (and use a plain 
string.join when getvalue is called and there is no buffer at all).
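
A rough Python sketch of the idea in the paragraph above may help. It is only an illustration, not how io.StringIO is implemented: the class name ListBackedWriter and its private attributes are hypothetical. Writes are kept as a list of strings, and a flat buffer is created only once something (here, seek()) actually needs one.

```python
import io


class ListBackedWriter:
    """Hypothetical StringIO-like writer; not the real io.StringIO."""

    def __init__(self):
        self._chunks = []    # strings written so far, not yet copied
        self._buffer = None  # flat buffer, created only on demand

    def _linearize(self):
        # Convert the list of chunks into a real flat buffer (one copy).
        if self._buffer is None:
            self._buffer = io.StringIO()
            self._buffer.write("".join(self._chunks))
            self._chunks.clear()

    def write(self, s):
        if self._buffer is not None:
            # Already linearized: behave like a plain StringIO.
            return self._buffer.write(s)
        self._chunks.append(s)  # no copy of the character data here
        return len(s)

    def seek(self, pos):
        # Random access needs a linear buffer.
        self._linearize()
        return self._buffer.seek(pos)

    def getvalue(self):
        if self._buffer is not None:
            return self._buffer.getvalue()
        # No buffer at all: a plain join, copying the data only once.
        return "".join(self._chunks)
```

In the append-only case the data is copied once, in getvalue(); the flat buffer costs an extra copy only for callers that seek.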

--




[issue12911] Expose a private accumulator C API

2011-10-03 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> > That's another possibility. But we'd have to expose a
> > C API anyway, and this one is as good as any other.
>
> No, it's not: it's additional clutter. If new API needs to be added,
> adding it for existing structures is better. Notice that you don't
> *need* new API, as you can use StringIO just fine from C also.

Yes, but using StringIO without a dedicated C API is more tedious and
quite a bit slower.

> > Note that StringIO will copy data twice (once when calling
> > write(), once when calling getvalue()), while ''.join() only once (at
> > the end, when concatenating all strings).
>
> Sounds like a flaw in StringIO to me. It could also manage a list of
> strings that have been written, rather than only using a flat buffer.
> Only when someone actually needs a linear buffer, it could convert it
> (and use a plain string.join when getvalue is called and there is no
> buffer at all).

That's what I thought as well. However, that's probably too much for a
bugfix release (and this issue is meant to allow test_bigmem to pass on
3.x).

--




[issue12911] Expose a private accumulator C API

2011-10-02 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

New patch implementing Martin's suggested optimization (only instantiate the 
large list when necessary).

--
Added file: http://bugs.python.org/file23299/accu3.patch




[issue12911] Expose a private accumulator C API

2011-09-16 Thread John O'Connor

Changes by John O'Connor tehj...@gmail.com:


--
nosy: +jcon




[issue12911] Expose a private accumulator C API

2011-09-07 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

I'm -1 on this approach; I don't think yet another container type is the right 
solution, given that we have already plenty of them.

If you want to avoid creating large lists, then the StringIO type should 
already provide that. So I wonder why these functions couldn't be rewritten to 
use StringIO.

If you really want to use this approach, I'd try to avoid allocating the large 
list if there are only few substrings. I.e. allocate it only when flushing, and 
only if the flush is not the final flush.
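
The suggestion in the last paragraph could look roughly like the following Python model. It is a sketch under assumptions, not the patch itself: the class name LazyAccumulator, its attributes, and the threshold value are all made up for illustration. The large list is created only when a flush happens before the end; with few substrings, finishing is a single join.

```python
class LazyAccumulator:
    """Hypothetical model of the suggested optimization; not the C code."""

    def __init__(self, flush_threshold=100000):
        self.small = []       # recently added substrings
        self.large = None     # created only on a non-final flush
        self.flush_threshold = flush_threshold  # arbitrary illustrative value

    def accumulate(self, s):
        self.small.append(s)
        if len(self.small) >= self.flush_threshold:
            # Non-final flush: only now is the large list needed.
            if self.large is None:
                self.large = []
            self.large.append("".join(self.small))
            self.small.clear()

    def finish(self):
        if self.large is None:
            # Few substrings: the final flush is one join, no large list.
            result = "".join(self.small)
        else:
            self.large.append("".join(self.small))
            result = "".join(self.large)
        self.small = []
        self.large = None
        return result
```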

--




[issue12911] Expose a private accumulator C API

2011-09-07 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> I'm -1 on this approach; I don't think yet another container type is
> the right solution, given that we have already plenty of them.

It's not a container type, just a small C struct that gets allocated on
the stack. Think of it as a library, like stringlib.

> If you want to avoid creating large lists, then the StringIO type
> should already provide that. So I wonder why these functions couldn't
> be rewritten to use StringIO.

That's another possibility. But we'd have to expose a C API anyway, and
this one is as good as any other.

Note that StringIO will copy data twice (once when calling write(), once
when calling getvalue()), while ''.join() only once (at the end, when
concatenating all strings).

> If you really want to use this approach, I'd try to avoid allocating
> the large list if there are only few substrings. I.e. allocate it only
> when flushing, and only if the flush is not the final flush.

That's possible, indeed.

--




[issue12911] Expose a private accumulator C API

2011-09-06 Thread Antoine Pitrou

New submission from Antoine Pitrou pit...@free.fr:

In 47176e8d7060, I fixed json to not blow memory when serializing a large 
container of small objects.
It turns out that the repr() of tuple objects (and, very likely, list objects 
and possibly other containers) has the same problem. For example, Martin's 
256GB machine can't serialize a two billion-element tuple:
http://www.python.org/dev/buildbot/all/builders/AMD64%20debian%20parallel%20custom/builds/6/steps/test/logs/stdio

So I propose to expose a private C API for the accumulator pattern introduced 
in 47176e8d7060 (with, e.g., the _PyAccumulator prefix), and to use that API 
from relevant code.
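
For readers unfamiliar with the accumulator pattern, here is a minimal Python model of it as this thread describes it. It is a sketch only: the real proposal is a private C API, and the names Accumulator and repr_of_tuple, as well as the threshold value, are hypothetical. Short strings are collected in a small list, which is periodically joined into one chunk so that the number of live string objects stays bounded.

```python
class Accumulator:
    """Hypothetical Python model of the pattern; not the proposed C API."""

    def __init__(self, flush_threshold=100000):
        self.small = []   # recently added short strings
        self.large = []   # one joined chunk per flush
        self.flush_threshold = flush_threshold  # arbitrary illustrative value

    def _flush(self):
        # Collapse many small strings into a single chunk.
        if self.small:
            self.large.append("".join(self.small))
            self.small.clear()

    def accumulate(self, s):
        self.small.append(s)
        if len(self.small) >= self.flush_threshold:
            self._flush()

    def finish(self):
        self._flush()
        result = "".join(self.large)
        self.large.clear()
        return result


def repr_of_tuple(items, flush_threshold=100000):
    """Build repr(items) piece by piece through the accumulator."""
    acc = Accumulator(flush_threshold)
    acc.accumulate("(")
    for i, item in enumerate(items):
        if i:
            acc.accumulate(", ")
        acc.accumulate(repr(item))
    if len(items) == 1:
        acc.accumulate(",")  # one-element tuples need a trailing comma
    acc.accumulate(")")
    return acc.finish()
```

A quick check with a small threshold: repr_of_tuple((1, 2, 3), flush_threshold=2) gives the same string as repr((1, 2, 3)).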

--
components: Interpreter Core
messages: 143598
nosy: mark.dickinson, pitrou, rhettinger
priority: normal
severity: normal
status: open
title: Expose a private accumulator C API
type: resource usage
versions: Python 3.2, Python 3.3




[issue12911] Expose a private accumulator C API

2011-09-06 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Here is a patch against 3.2.
In the default branch it will also help factor out some code from the _json 
module.

--
stage:  -> patch review




[issue12911] Expose a private accumulator C API

2011-09-06 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
keywords: +patch
Added file: http://bugs.python.org/file23109/accu.patch




[issue12911] Expose a private accumulator C API

2011-09-06 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

For the record, the patch fixes the test_bigmem crashes when testing repr() on 
tuples and lists:
http://www.python.org/dev/buildbot/all/builders/AMD64%20debian%20parallel%20custom/builds/10/steps/test/logs/stdio

--




[issue12911] Expose a private accumulator C API

2011-09-06 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Updated patch (mostly cosmetic stuff) after Benjamin's comments.

--
Added file: http://bugs.python.org/file23111/accu2.patch
