Serhiy Storchaka added the comment:
Here is a patch which restores optimization for frame headers. Unfortunately it
breaks test_optional_frames.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17810
Larry Hastings added the comment:
Isn't it a little late to be changing the pickle protocol, now that we've hit
feature-freeze? If you want to check something like this in you're going to
have to make a good case for it.
Changes by Serhiy Storchaka storch...@gmail.com:
Added file: http://bugs.python.org/file32840/pickle_frame_headers.patch
Serhiy Storchaka added the comment:
This doesn't change the pickle protocol. This is just an implementation detail.
Alexandre Vassalotti added the comment:
Optimizing the output of the pickler class should be fine during the feature
freeze as long as the semantics of the current opcodes stay unchanged.
Antoine Pitrou added the comment:
Well, Larry may expand, but I think we don't commit performance optimizations
during the feature freeze either.
(feature is taken in the same sense as in no new features in the bugfix
branches)
Larry Hastings added the comment:
I'll make you a deal. As long as the protocol remains 100% backwards and
forwards compatible (3.4.0b1 can read anything written by trunk, and trunk can
read anything written by 3.4.0b1), you can make optimizations until beta 2.
After that you have to
Serhiy Storchaka added the comment:
I have opened separate issue19780 for this.
Roundup Robot added the comment:
New changeset 992ef855b3ed by Antoine Pitrou in branch 'default':
Issue #17810: Implement PEP 3154, pickle protocol 4.
http://hg.python.org/cpython/rev/992ef855b3ed
--
nosy: +python-dev
Antoine Pitrou added the comment:
I've now committed Alexandre's latest work (including the FRAME and MEMOIZE
opcodes).
--
resolution: -> fixed
stage: patch review -> committed/rejected
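The two opcodes named in this commit can be observed in any protocol 4 pickle. A minimal sketch, assuming Python 3.4 or later; the sample list is arbitrary and not taken from the thread:

```python
import pickle
import pickletools

# Arbitrary sample data (an assumption for illustration only).
payload = pickle.dumps(list(range(1000)), protocol=4)

# pickletools.genops yields (opcode, argument, position) triples.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]

# Show the first few opcodes and whether FRAME and MEMOIZE appear at all.
print(opcode_names[:4])
print("FRAME" in opcode_names, "MEMOIZE" in opcode_names)
```

pickletools.dis(payload) prints the same information in a longer, annotated form.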
Roundup Robot added the comment:
New changeset d719975f4d25 by Christian Heimes in branch 'default':
Issue #17810: Add NULL check to save_frozenset
http://hg.python.org/cpython/rev/d719975f4d25
Roundup Robot added the comment:
New changeset c54becd69805 by Christian Heimes in branch 'default':
Issue #17810: return -1 on error
http://hg.python.org/cpython/rev/c54becd69805
Roundup Robot added the comment:
New changeset a02adfb3260a by Christian Heimes in branch 'default':
Issue #17810: Add two missing error checks to save_global
http://hg.python.org/cpython/rev/a02adfb3260a
Roundup Robot added the comment:
New changeset 3e16c8c34e69 by Christian Heimes in branch 'default':
Issue #17810: Fixed NULL check in _PyObject_GetItemsIter()
http://hg.python.org/cpython/rev/3e16c8c34e69
Alexandre Vassalotti added the comment:
I've finalized the framing implementation in de9bda43d552.
There will be more improvements to come until 3.4 final. However, feature-wise
we are done. Thank you everyone for the help!
--
status: open -> closed
Tim Peters added the comment:
[Alexandre Vassalotti]
I've finalized the framing implementation in de9bda43d552.
There will be more improvements to come until 3.4 final. However, feature-wise
we are done. Thank you everyone for the help!
Woo hoo! Thank YOU for the hard work - I know how
Changes by Antoine Pitrou pit...@free.fr:
--
nosy: +larry
priority: high -> release blocker
Serhiy Storchaka added the comment:
I propose storing each frame's size in the previous frame. This would halve the
number of file reads.
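To see how many reads an unpickler issues against a stream, the calls can be counted with a small wrapper. This is only a sketch: CountingReader and count_reads are hypothetical helpers, not part of pickle, and the exact counts depend on the Python version and on whether the C or pure Python unpickler is in use.

```python
import io
import pickle


class CountingReader:
    """Hypothetical file-like wrapper that counts read()/readline() calls."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.calls = 0

    def read(self, n=-1):
        self.calls += 1
        return self._buf.read(n)

    def readline(self):
        self.calls += 1
        return self._buf.readline()


def count_reads(obj, protocol):
    """Pickle obj with the given protocol, unpickle it, return the read count."""
    reader = CountingReader(pickle.dumps(obj, protocol=protocol))
    loaded = pickle.load(reader)
    assert loaded == obj
    return reader.calls


data = list(range(1000))          # arbitrary sample data
reads_unframed = count_reads(data, 3)   # protocol 3 has no framing
reads_framed = count_reads(data, 4)     # protocol 4 uses FRAME
print(reads_unframed, reads_framed)
```

The wrapper deliberately exposes only read() and readline(), so the unpickler cannot fall back on peek() or readinto() and every request shows up in the count.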
Martin v. Löwis added the comment:
Attached is a patch that takes a different approach to framing, putting it into
an optional framing layer by means of a buffered reader/writer.
The framing structure is the same as in PEP 3154; a separate PYFRAMES magic is
prepended to guard against protocol
Alexandre Vassalotti added the comment:
I have been looking again at Stefan's previous proposal of making memoization
implicit in the new pickle protocol. While I liked the smaller pickles it
produced, I didn't like the invasiveness of the implementation, which requires a
change for almost every
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Added file: http://bugs.python.org/file32639/f87b455af573.diff
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Added file: http://bugs.python.org/file32640/8434af450da0.diff
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Removed file: http://bugs.python.org/file32639/f87b455af573.diff
Alexandre Vassalotti added the comment:
Hi folks,
I consider my implementation of PEP-3154 mostly feature complete at this point.
I still have a few things left to do. For example, I need to update the
documentation about the new protocol. However, these can mostly be done along
the review
Alexandre Vassalotti added the comment:
I am still working on it. I implemented support for nested globals last week
(http://hg.python.org/features/pep-3154-alexandre/rev/c8991b32a47e). At this
point, the only big piece remaining is the support for method descriptors.
There are other minor
Antoine Pitrou added the comment:
Alexandre, Stefan, is either of you working on this?
If not, could you please explain what the status of the patch is, whose work is
the most advanced (Alexandre's or Stefan's) and what should be the plan to move
this forward?
Thanks!
--
Nick Coghlan added the comment:
Potentially relevant to this: we hope to have PEP 451 done for 3.4, which adds
a __spec__ attribute to module objects, and will also tweak runpy to ensure -m
registers __main__ under its real name as well.
If pickle uses __spec__.name in preference to __name__
Alexandre Vassalotti added the comment:
Stefan, could you address my review comments soon? The improved support for
globals is the only big piece missing from the implementation of the PEP, which I
would like to get done and submitted by the end of the month.
--
Stefan Mihaila added the comment:
On 6/3/2013 9:33 PM, Alexandre Vassalotti wrote:
Alexandre Vassalotti added the comment:
Stefan, could you address my review comments soon? The improved support for
globals is the only big piece missing from the implementation of the PEP, which I
would like
Alexandre Vassalotti added the comment:
Stefan, I took a quick look at your patch. There are a couple of things that stand
out.
First, I think the implementation of BINGLOBAL and BINGLOBAL_BIG should be
moved to another patch. Adding a binary version opcode for GLOBAL is a separate
feature and
Antoine Pitrou added the comment:
Stefan, I took a quick look at your patch. There are a couple of things
that stand out.
It would be nice if you could reconcile each other's work. Especially so
I don't re-implement framing on top of something else :-)
Adding a binary version opcode for GLOBAL
Alexandre Vassalotti added the comment:
Thanks Stefan for the patch. It's very much appreciated. I will try to review
it soon.
Of the features you proposed, the two I would like to take another look at are
implicit memoization and the BAIL_OUT opcode. For the implicit memoization
feature, we
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Removed file: http://bugs.python.org/file30229/pickle4+methods.patch
Changes by Stefan Mihaila mstefa...@gmail.com:
--
nosy: +mstefanro
Changes by Stefan Mihaila mstefa...@gmail.com:
Added file: http://bugs.python.org/file30211/780722877a3e.diff
Changes by Stefan Mihaila mstefa...@gmail.com:
Removed file: http://bugs.python.org/file30211/780722877a3e.diff
Stefan Mihaila added the comment:
On 5/10/2013 11:46 PM, Stefan Mihaila wrote:
Changes by Stefan Mihaila mstefa...@gmail.com:
--
nosy: +mstefanro
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17810
Antoine Pitrou added the comment:
Antoine, can you share the code for your benchmarks which show
performance improvements when framing is enabled? I am seeing the same
10-15% slowdown even when pickling stuff to pure Python objects:
The performance improvement is when unpickling, not when
Antoine Pitrou added the comment:
Here are some numbers:
# Without the patch
$ ./python -m timeit -s "import pickle, io; d=pickle.dumps(list(range(1000)), 4); b=io.BytesIO(d)" \
    "b.seek(0); pickle.load(b)"
1 loops, best of 3: 180 usec per loop
$ ./python -m timeit -s import pickle, _pyio as
Alexandre Vassalotti added the comment:
I am currently fleshing out an improved implementation for the reduce protocol
version 4. One thing I am curious about is whether we should keep the special
cases we currently have there for dict and list subclasses.
I recall Raymond expressed
Antoine Pitrou added the comment:
I think it could be worthwhile to investigate a generic API for
pickling collections in-place. For example, such an API would be helpful
for pickling set subclasses in-place.
Is the use case important enough? Otherwise, this is more
__special_method__
Alexandre Vassalotti added the comment:
Those methods wouldn't be much more of a maintenance burden than the special cases
already present in the implementation of __reduce__. These methods would only
need to be provided by classes that wish to support efficient in-place
pickling provided by
Antoine Pitrou added the comment:
Here is an updated framing patch which fixes the issue reported by Alexandre.
There are also a couple added tests.
--
Added file: http://bugs.python.org/file30118/framing3.patch
Alexandre Vassalotti added the comment:
The framing patch seems to have a significant negative effect on performance.
Report on Linux avassalotti 3.2.5-gg1130 #1 SMP Mon Feb 4 02:25:47 PST 2013
x86_64 x86_64
Total CPU cores: 12
### fastpickle ###
Min: 0.447194 - 0.505841: 1.13x slower
Avg:
Antoine Pitrou added the comment:
The framing patch seems to have a significant negative effect on
performance.
I wouldn't call it significant. Any speedup or slowdown less than 50% is
unlikely to be noticeable in real-world applications.
Mitigating the regression is probably a matter of
Alexandre Vassalotti added the comment:
Antoine, can you share the code for your benchmarks which show performance
improvements when framing is enabled? I am seeing the same 10-15% slowdown even
when pickling stuff to pure Python objects:
### Without the patch
./python -m timeit -r 50 -s
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Refactor reduce protocol implementation
Alexandre Vassalotti added the comment:
The latest framing patch looks pretty nice overall. One concern is that we need to
make sure the C implementation calls _Pickler_OpcodeBoundary often enough to
keep the frames around the expected size. For example, batch_save_list and
batch_save_dict can currently
Antoine Pitrou added the comment:
One concern is that we need to make sure the C implementation calls
_Pickler_OpcodeBoundary often enough to keep the frames around the
expected size. For example, batch_save_list and batch_save_dict can currently
create a frame much larger than expected.
I don't
Alexandre Vassalotti added the comment:
I don't understand how that can happen. batch_list() and batch_dict()
both call save() for each item, and save() calls
_Pickler_OpcodeBoundary() at the end. Have I missed something?
Ah, you're right. I was thinking in terms of my fast dispatch patch in
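The frame sizes a pickler actually produces can be inspected from the outside. A rough sketch, assuming Python 3.4 or later; the 64 KiB figure is the target suggested in PEP 3154 and is treated here as an assumption about the implementation, and the list size is arbitrary:

```python
import pickle
import pickletools

# Assumed frame size target (64 KiB, as suggested in PEP 3154).
FRAME_SIZE_TARGET = 64 * 1024

# Arbitrary list, large enough to need more than one frame.
big_list = list(range(200000))
payload = pickle.dumps(big_list, protocol=4)

# The argument of each FRAME opcode is the length of that frame in bytes.
frame_sizes = [
    arg for op, arg, pos in pickletools.genops(payload) if op.name == "FRAME"
]
print(len(frame_sizes), min(frame_sizes), max(frame_sizes))
```

If the boundary check runs after every saved item, no frame should overshoot the target by more than one small item's worth of bytes.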
Antoine Pitrou added the comment:
Here is an updated framing patch which supports pickletools.optimize().
--
Added file: http://bugs.python.org/file30094/framing2.patch
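For context, this is how pickletools.optimize() is used on a framed pickle. A minimal sketch, assuming Python 3.4 or later and an arbitrary sample dictionary:

```python
import pickle
import pickletools

# Arbitrary sample data (an assumption for illustration only).
data = {"name": "spam", "values": [1, 2, 3]}

original = pickle.dumps(data, protocol=4)
# optimize() drops memo opcodes that are never referenced later.
optimized = pickletools.optimize(original)

# The optimized pickle must still round-trip to an equal object.
restored = pickle.loads(optimized)
print(restored == data, len(original), len(optimized))
```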
Antoine Pitrou added the comment:
Here is a framing patch on top of Alexandre's work.
There is one thing that framing breaks: pickletools.optimize(). I think it
would be non-trivial to fix it. Perhaps the PREFETCH opcode is a better idea
for this.
Alexandre, I don't understand why you
Serhiy Storchaka added the comment:
What is wrong with GLOBAL?
Antoine Pitrou added the comment:
What is wrong with GLOBAL?
It uses the lame text mode that scans for newlines, and is generally
annoying to optimize. This is like C strings vs. Pascal strings.
http://www.python.org/dev/peps/pep-3154/#binary-encoding-for-all-opcodes
--
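The difference between the two encodings can be seen by pickling the same global under an old and a new protocol. A small sketch, assuming Python 3.4 or later; fractions.Fraction is just a convenient global picked for illustration, and opcode_names is a hypothetical helper:

```python
import pickle
import pickletools
from fractions import Fraction


def opcode_names(payload):
    """Return the opcode names found in a pickle (hypothetical helper)."""
    return [op.name for op, arg, pos in pickletools.genops(payload)]


# Protocol 2 emits GLOBAL: module and name as newline-terminated text.
text_style = pickle.dumps(Fraction, protocol=2, fix_imports=False)
# Protocol 4 pushes length-prefixed strings, then emits STACK_GLOBAL.
binary_style = pickle.dumps(Fraction, protocol=4)

print(text_style)
print(opcode_names(text_style))
print(opcode_names(binary_style))
```

In the protocol 2 output the reader has to scan for the two newline bytes; in the protocol 4 output each string carries its length up front.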
Serhiy Storchaka added the comment:
With framing it isn't annoying.
Alexandre Vassalotti added the comment:
Antoine, I removed STACK_GLOBAL when I found performance issues with the
implementation. The changeset that added it had some unrelated changes that
made it harder to debug than necessary. I am planning to re-add it once I have
worked out the kinks.
Antoine Pitrou added the comment:
With framing it isn't annoying.
Slightly less, but you still have to wrap readline() calls in the
unpickler.
I have started experimenting with PREFETCH, but making the opcode
optional is a bit annoying in the C pickler, which means it's simpler to
always emit
Antoine Pitrou added the comment:
And here is an implementation of PREFETCH over Alexandre's work.
As you can see the code complexity compared to framing is mostly a wash, but I
think fixing pickletools.optimize() will be easier with PREFETCH (still needs
confirmation, of course :-)).
Serhiy Storchaka added the comment:
I was thinking about framing before looking at your last changes to PEP 3154
and I have two alternative proposals.
1. Pack pickled items in blocks of some predefined (or specified at the start
with the BLOCKSIZE opcode) size. Only some large data (long
Antoine Pitrou added the comment:
1. Pack pickled items in blocks of some predefined (or specified at the
start with the BLOCKSIZE opcode) size. Only some large data (long
strings, large integers) can cross the boundary between blocks. In all
other cases the block should be padded with the
Serhiy Storchaka added the comment:
Padding makes it both less efficient and more annoying to handle, IMO.
Agree. But there is another application for NOPs. The UTF-8 decoder (and some other
decoders) works faster (up to 4x) when the input is aligned. By adding several
NOPs before BINUNICODE so
Charles-François Natali added the comment:
I would like to see Proto4 include an option for compression
(zlib,bz2) or somesuch and become self-decompressing upon unpickling.
I don't see what this would bring over explicit compression:
- depending on the use case, you may want to use different
Antoine Pitrou added the comment:
I don't see what this would bring over explicit compression:
- depending on the use case, you may want to use different compression
algorithms, e.g. for disk you may want higher compression ratio like
bzip2/lzma, but for wire you'd prefer something fast
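Explicit compression, as discussed here, means wrapping the pickle bytes in whichever codec suits the destination. A minimal sketch; the choice of zlib level 1 for the wire and lzma for disk is an assumption made for illustration, not a recommendation from the thread:

```python
import lzma
import pickle
import zlib

# Arbitrary sample data (an assumption for illustration only).
data = {"numbers": list(range(10000)), "label": "example"}
raw = pickle.dumps(data, protocol=4)

# Fast, lighter compression, e.g. before sending over a wire.
fast = zlib.compress(raw, 1)
# Slower, usually tighter compression, e.g. before writing to disk.
small = lzma.compress(raw)

# The caller decompresses explicitly before unpickling.
from_wire = pickle.loads(zlib.decompress(fast))
from_disk = pickle.loads(lzma.decompress(small))
print(len(raw), len(fast), len(small))
```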
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
Antoine Pitrou added the comment:
A proof of concept hack to enable framing on pickle showed a massive
performance increase on streaming unpickling (up to 5x faster with a C file
object such as io.BytesIO, up to 150x faster with a pure Python file object
such as _pyio.BytesIO). There is a
Antoine Pitrou added the comment:
(note: I've updated PEP 3154 with framing and STACK_GLOBAL)
Serhiy Storchaka added the comment:
A feature that may be actually nice to have in the pickle protocol would
be some framing, to help with streaming unpickling (right now unpickling
a stream can read almost one byte at a time, IIRC).
However, that would also make the protocol and the pickler
Antoine Pitrou added the comment:
What if we just use io.BufferedReader?
if not isinstance(file, io.BufferedReader):
    file = io.BufferedReader(file)
(at start of _Unpickler.__init__)
Two problems:
1. semantically, it is wrong; the BufferedReader will read bytes beyond
the
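The read-ahead behaviour behind the first objection can be reproduced with two pickles stored back to back. A small sketch, assuming an in-memory io.BytesIO stands in for the caller's file object:

```python
import io
import pickle

first = pickle.dumps("first", protocol=4)
second = pickle.dumps("second", protocol=4)

# The caller's stream holds two pickles, one after the other.
raw = io.BytesIO(first + second)
buffered = io.BufferedReader(raw)

# Only the first pickle is loaded, through the buffering wrapper.
loaded = pickle.load(buffered)

# The underlying stream's position has moved past the first pickle,
# because the wrapper filled its buffer with whatever was available.
print(loaded, len(first), raw.tell())
```

Anything that later reads from raw directly, rather than through the wrapper, would miss the bytes already pulled into the buffer.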
Raymond Hettinger added the comment:
I would like to see Proto4 include an option for compression (zlib,bz2) or
somesuch and become self-decompressing upon unpickling. The primary use cases
for pickling involve writing to disk or transmitting across a wire -- both use
cases benefit from
New submission from Alexandre Vassalotti:
I have restarted the work on PEP 3154. Stefan Mihaila had begun an
implementation as part of the Google Summer of Code 2012. Unfortunately, he hit
multiple roadblocks which prevented him from finishing his work by the end of the
summer. He previously shown
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Unbinding of methods
Changes by Andrew Svetlov andrew.svet...@gmail.com:
--
nosy: +asvetlov
Changes by Antoine Pitrou pit...@free.fr:
--
keywords: +patch
Added file: http://bugs.python.org/file29966/9f1be171da08.diff
Antoine Pitrou added the comment:
Thank you for reviving this :)
A couple of questions:
- why ADDITEM in addition to ADDITEMS? I don't think single-element sets are an
important use case (as opposed to, say, single-element tuples)
- what is the purpose of STACK_GLOBAL? I would say memoization
Changes by Serhiy Storchaka storch...@gmail.com:
--
nosy: +serhiy.storchaka
Serhiy Storchaka added the comment:
Link to the previous attempt: issue15642.
Serhiy Storchaka added the comment:
Memoization consumes memory during pickling. For now every memoized object
requires memory for:
a dict entry;
an id() integer object;
a 2-element tuple;
a pickle index (an integer object).
It's about 80 bytes on a 32-bit platform (and twice that on
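For reference, what the memo provides in exchange for that memory is that an object referenced twice is written once and then fetched back by index. A minimal sketch, assuming Python 3.4 or later and an arbitrary shared list:

```python
import pickle
import pickletools

# One list object referenced twice from the same container.
shared = ["spam"]
obj = [shared, shared]

payload = pickle.dumps(obj, protocol=4)
loaded = pickle.loads(payload)

# Identity is preserved: both slots refer to the same list after loading.
print(loaded[0] is loaded[1])

# The second reference is emitted as a memo lookup, not a second copy.
names = [op.name for op, arg, pos in pickletools.genops(payload)]
print(names)
```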
Antoine Pitrou added the comment:
Memoization consumes memory during pickling. For now every memoized
object requires memory for:
a dict entry;
an id() integer object;
a 2-element tuple;
a pickle index (an integer object).
It's about 80 bytes on a 32-bit platform (and twice that