Alexandre Vassalotti added the comment:
Our hands are pretty much tied here. The pickle bytearray as unicode hack is
likely the best we can do without pickling compatibility between Python 2 and
3. I can't think of a solution that could work here. For example.
1. Pickling bytearrays
Alexandre Vassalotti added the comment:
I am currently fleshing out an improved implementation for the reduce protocol
version 4. One thing I am curious about is whether we should keep the special
cases we currently have there for dict and list subclasses.
I recall Raymond expressed
Alexandre Vassalotti added the comment:
Those methods wouldn't be much more a maintenance burden than the special cases
already present in the implementation of __reduce__. These methods would only
need to be provided by classes that wish to support efficient in-place
pickling provided
Alexandre Vassalotti added the comment:
The framing patch seems to have a significant negative effect on performance.
Report on Linux avassalotti 3.2.5-gg1130 #1 SMP Mon Feb 4 02:25:47 PST 2013
x86_64 x86_64
Total CPU cores: 12
### fastpickle ###
Min: 0.447194 - 0.505841: 1.13x slower
Avg
Alexandre Vassalotti added the comment:
Do you have benchmark results to show the code with the patch is faster?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17897
Alexandre Vassalotti added the comment:
Antoine, can you share the code for your benchmarks which show performance
improvements when framing is enabled? I am seeing the same 10-15% slowdown even
when pickling stuff to pure Python objects:
### Without the patch
./python -m timeit -r 50 -s
New submission from Alexandre Vassalotti:
The changeset 2dd046be2c88 introduced _PyObject_CallMethodObjIdArgs. This API
should have been named _PyObject_CallMethodIdObjArgs since it is the variant of
_PyObject_CallMethodId which takes object arguments instead of building
arguments from
New submission from Alexandre Vassalotti:
I have tried to clean up a bit of the implementation of the reduce protocol in
typeobject.c in preparation for PEP 3154's support of classes with __new__
using keyword-only arguments. I am not quite done yet with the refactorings,
but I would
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Refactor reduce protocol implementation
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17810
Alexandre Vassalotti added the comment:
The latest framing patch looks pretty nice overall. One concern is we need to
make sure the C implementation calls _Pickler_OpcodeBoundary often enough to
keep the frames around the sizes. For example, batch_save_list and
batch_save_dict can currently
Alexandre Vassalotti added the comment:
> I don't understand how that can happen. batch_list() and batch_dict()
> both call save() for each item, and save() calls
> _Pickler_OpcodeBoundary() at the end. Have I missed something?
Ah, you're right. I was thinking in terms of my fast dispatch patch
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Implement PEP 3154 (pickle protocol 4)
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4727
Alexandre Vassalotti added the comment:
It is an integer overflow issue. It is easy to reproduce without numpy:
$ python2.7 -c "import cPickle; cPickle.dumps('\x00' * 2**31)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
SystemError: error return without exception set
We
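To illustrate the kind of limit involved, here is a minimal sketch. It is not the cPickle code; it only assumes that a length is being stored in a signed 32-bit field, and uses the struct module to show that 2**31 is one past the largest value such a field can hold. The helper name fits_in_int32 is made up for this example.

```python
import struct

INT32_MAX = 2**31 - 1   # largest value a signed 32-bit integer can hold
length = 2**31          # length of the string in the reproduction above


def fits_in_int32(n):
    """Return True if n can be packed as a signed 32-bit integer."""
    try:
        struct.pack("<i", n)
    except struct.error:
        return False
    return True


print(fits_in_int32(INT32_MAX))
print(fits_in_int32(length))
```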
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Implement PEP 3154 (pickle protocol 4)
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9276
Alexandre Vassalotti added the comment:
If you are pickling large objects, you're likely hitting issue #11872. We fixed
most 64-bit issues in Python 3, so upgrading might be a solution if possible.
Since the particular bug you are hitting cannot be reproduced with your test
case, I am closing
Alexandre Vassalotti added the comment:
Without a test case, we cannot tell if this is a bug in pickle or not. Anyhow,
Floris's explanation is pretty much on the dot as to why you might see this error.
--
nosy: +alexandre.vassalotti
resolution: - works for me
stage: - committed/rejected
Alexandre Vassalotti added the comment:
There is no guarantee the binary representation of pickled data will be the same
between different runs. We try to make it mostly consistent when we can, but
there are cases, like this one, where we cannot ensure consistency without
hurting performance
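As a small illustration of equal values producing different bytes (this may or may not be the case discussed in the issue, which is cut off here): memoization makes the output depend on object identity, not just on value. The variable names below are chosen for this sketch only.

```python
import pickle

inner = [1, 2]
shared = [inner, inner]         # the same list object referenced twice
copied = [inner, list(inner)]   # two equal but distinct list objects

shared_bytes = pickle.dumps(shared, protocol=2)
copied_bytes = pickle.dumps(copied, protocol=2)

# Both streams load back to equal values...
same_value = pickle.loads(shared_bytes) == pickle.loads(copied_bytes)
# ...but the second reference in `shared` is written as a memo lookup,
# so the byte streams are not identical.
same_bytes = shared_bytes == copied_bytes

print(same_value, same_bytes)
```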
Alexandre Vassalotti added the comment:
I fixed this while working on PEP 3154
[http://hg.python.org/features/pep-3154-alexandre/rev/eed9142d664f]. The
relevant piece is
@@ -420,7 +424,13 @@ class _Pickler:
write(REDUCE)
if obj is not None:
-self.memoize
Alexandre Vassalotti added the comment:
Antoine, I removed STACK_GLOBAL when I found performance issues with the
implementation. The changeset that added it had some unrelated changes that
made it harder to debug than necessary. I am planning to re-add it once I have
worked out the kinks
Alexandre Vassalotti added the comment:
I think it should be fixed though it is not a high-priority issue.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16231
New submission from Alexandre Vassalotti:
I have restarted the work on PEP 3154. Stefan Mihaila had begun an
implementation as part of the Google Summer of Code 2012. Unfortunately, he hit
multiple roadblocks which prevented him from finishing his work by the end of the
summer. He had previously shown
Alexandre Vassalotti added the comment:
I have started a new implementation of PEP 3154 since Stefan hasn't been active
on his. Moving the discussion to Issue #17810.
--
dependencies: -Unbinding of methods
resolution: - out of date
stage: patch review - committed/rejected
status
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Unbinding of methods
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17810
Alexandre Vassalotti added the comment:
Here's a patch to fix the exception.
--
keywords: +patch
Added file: http://bugs.python.org/file29949/fix_array_err_msg.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3693
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
resolution: - fixed
stage: patch review - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17785
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
resolution: - fixed
stage: patch review - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17720
Alexandre Vassalotti added the comment:
Are you asking why we need to call both PyMemoTable_Get and memo_get? Or
why fetching the memo was moved to the save functions? For the former,
there is no real reason. The extra call could be removed though profiling
doesn't show this call
Alexandre Vassalotti added the comment:
In addition, we could bring back the switch dispatch based on the first letter
of the type name. It does seem to speed things up as well, but not as much as the
type cache optimization.
--
nosy: +pitrou, serhiy.storchaka
title: Optimize pickling
New submission from Alexandre Vassalotti:
As noted previously, TinyURL URL shortening API is rather slow and not always
available. Each request takes between 0.5 and 1.5 seconds.
We should change it to use Google URL Shortener which returns a response within
10 milliseconds and is much more
New submission from Alexandre Vassalotti:
Pickle fast mode is currently a deprecated feature of the Pickler class used to
disable its memoization mechanism. It was used mainly to create smaller output
by not emitting a PUT opcode for each object saved. Unfortunately, this mode
only worked
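A minimal sketch of the size effect described above, using the Pickler.fast attribute. The helper name dump and the sample data are made up for this example, and protocol 2 is chosen only so that the memo opcodes are easy to spot.

```python
import io
import pickle


def dump(obj, fast):
    """Pickle obj with protocol 2, optionally with fast mode enabled."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf, protocol=2)
    pickler.fast = fast   # deprecated switch that disables memoization
    pickler.dump(obj)
    return buf.getvalue()


data = [[1], [2], [3]]
normal_bytes = dump(data, fast=False)
fast_bytes = dump(data, fast=True)

# Without memoization no PUT opcode is written for each list,
# so the fast-mode stream is shorter for this input.
print(len(normal_bytes), len(fast_bytes))
```

Both streams load back to the same value; only the memo bookkeeping differs.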
Alexandre Vassalotti added the comment:
Alright alright! Here's a less bogus patch. :)
--
stage: needs patch - patch review
Added file: http://bugs.python.org/file29846/fix_loads_appends.patch
___
Python tracker rep...@bugs.python.org
http
New submission from Alexandre Vassalotti:
In pickle.py, load_appends is currently defined as
    def load_appends(self):
        stack = self.stack
        mark = self.marker()
        list = stack[mark - 1]
        list.extend(stack[mark + 1:])
        del stack[mark:]
However, according
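The rest of this report is cut off, so the following is only an illustration of one property of the quoted code: it requires the target object to provide extend(). The class AppendOnly and both helper functions are hypothetical names for this sketch, and the fallback variant is an assumption about one possible approach, not the fix that was applied.

```python
class AppendOnly:
    """Hypothetical container that exposes append() but not extend()."""

    def __init__(self):
        self.items = []

    def append(self, item):
        self.items.append(item)


def appends_extend_only(target, items):
    # Mirrors the quoted logic: assumes the target has extend().
    target.extend(items)


def appends_with_fallback(target, items):
    # Hypothetical variant: use extend() when present, else append() per item.
    extend = getattr(target, "extend", None)
    if extend is not None:
        extend(items)
    else:
        for item in items:
            target.append(item)


container = AppendOnly()
try:
    appends_extend_only(container, [1, 2])
except AttributeError:
    print("extend() is missing on this target")

appends_with_fallback(container, [1, 2])
print(container.items)
```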
Alexandre Vassalotti added the comment:
In protocol 0, the persistent ID is restricted to alphanumeric strings because
of the problems that arise when the persistent ID contains newline characters.
_pickle likely should be changed to use the ASCII decoder. And perhaps, we
should check
Alexandre Vassalotti added the comment:
I also wrote a patch for this. I took a slightly different approach though. I
fixed the C implementation to be more strict on the quoting. Currently, it
strips trailing non-printable characters, something pickle.py doesn't do.
I also cleaned up
Alexandre Vassalotti added the comment:
I was targeting head, not the release branches. It is fine to change the
exception there as we don't make any guarantee about the exceptions raised
during the unpickling process. It is easy enough to fix the patch to use ValueError
for the release branch
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Removed file: http://bugs.python.org/file29824/fix_quoted_string_python3.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17710
Alexandre Vassalotti added the comment:
Here's a patch that fixes the bug.
--
assignee: - alexandre.vassalotti
keywords: +patch
stage: needs patch - patch review
Added file: http://bugs.python.org/file29842/fix_bad_persid.patch
___
Python tracker rep
Alexandre Vassalotti added the comment:
The mutating __getstate__ is very likely the problem here. I've attached a
small test case which shows the described behavior.
We could fix this by always making a copy of any mutable container we want to
iterate over to save its items. Performance
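A minimal sketch of the idea of iterating over a copy, outside of pickle itself. The function names and the mutating_hook callback are made up for this example; the callback stands in for a __getstate__ that mutates the container while its items are being saved.

```python
def save_items(container, on_item):
    """Iterate the live container; fails if on_item resizes it."""
    for key in container:
        on_item(container, key)


def save_items_from_copy(container, on_item):
    """Iterate over a snapshot of the keys, so mutation is harmless."""
    for key in list(container):
        on_item(container, key)


def mutating_hook(container, key):
    # Stand-in for a side effect that adds entries during saving.
    container[("extra", key)] = None


state = {"a": 1, "b": 2}
try:
    save_items(dict(state), mutating_hook)
except RuntimeError as exc:
    print("live iteration failed:", exc)

safe = dict(state)
save_items_from_copy(safe, mutating_hook)
print(len(safe))
```

The snapshot costs an extra copy per container, which is the performance trade-off the comment goes on to raise.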
Alexandre Vassalotti added the comment:
My point is I would prefer that we keep all optimizations to only the _pickle C
module and keep the Python implementation as simple as possible.
Also, I doubt the slight speedup shown by your microbenchmark will actually
result in any significant
Alexandre Vassalotti added the comment:
pickle.py is the buggy one here. Its use of the marshal module is really a
hack. Plus, it is slower than both struct and int.from_bytes.
14:40:57 [~/cpython]$ ./python -m timeit "int.from_bytes(b'\xff\xff\xff\xff', 'big')"
100 loops, best of 3: 0.209
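For reference, a short sketch showing that the two alternatives named above decode the same four bytes to the same integer. It does not reproduce the timings, and it leaves the marshal-based code out entirely.

```python
import struct

raw = b"\xff\xff\xff\xff"

# Decode as a big-endian unsigned 32-bit integer, two ways.
via_int = int.from_bytes(raw, "big")
via_struct = struct.unpack(">I", raw)[0]

print(via_int, via_struct)  # both are 4294967295
```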
Alexandre Vassalotti added the comment:
Some quick thoughts about the new implicit memoization scheme in Stefan's
implementation.
- The new scheme will need to be documented in PEP 3154 before we can accept
the change.
- I don't really like the idea of changing the semantics of the PUT
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Added file: http://bugs.python.org/file26881/pickle4-2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
dependencies: +Unbinding of methods
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Removed file: http://bugs.python.org/file26881/pickle4-2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
Alexandre Vassalotti added the comment:
Oops, wrong patch. Uploading the right one.
--
Added file: http://bugs.python.org/file26882/pickle4-2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
New submission from Alexandre Vassalotti:
Stefan Mihaila has been working on the implementation of PEP 3154, plus some
other enhancements. His work is pretty complete and ready to be reviewed. I
will do my best to finish a thorough review of his changes by the end of next
week
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Added file: http://bugs.python.org/file26794/pickle4.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Removed file: http://bugs.python.org/file26794/pickle4.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
Changes by Alexandre Vassalotti alexan...@peadrop.com:
Added file: http://bugs.python.org/file26795/pickle4.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15642
Alexandre Vassalotti added the comment:
I get warnings when compiling with the patch:
/home/avassalotti/pickle4/Modules/_pickle.c: In function ‘save_global_binary’:
/home/avassalotti/pickle4/Modules/_pickle.c:2952: warning: pointer targets in
passing argument 2 of ‘_Pickler_Write’ differ
Alexandre Vassalotti added the comment:
There are reference leaks in the _pickle.c part that will need to be fixed too.
22:36:29 [~/pickle4]$ ./python -m test.regrtest -R :: test_pickle
[1/1] test_pickle
beginning 9 repetitions
123456789
.
test_pickle leaked [14780, 14780, 14780
Alexandre Vassalotti added the comment:
Amazing! Though, it would probably be a good idea to benchmark non-ASCII strings
as well.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15596
Alexandre Vassalotti added the comment:
I reviewed the patch here
http://bugs.python.org/review/15513/#ps5596
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15513
Alexandre Vassalotti alexan...@peadrop.com added the comment:
We will need to bump the protocol number to add support for None, Ellipsis, and
NotImplemented. Antoine, can you add this to PEP 3154?
--
___
Python tracker rep...@bugs.python.org
http
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Fixed. Thanks for the patch!
--
assignee: - alexandre.vassalotti
resolution: - fixed
stage: needs patch - committed/rejected
status: open - closed
___
Python tracker rep
Alexandre Vassalotti alexan...@peadrop.com added the comment:
sbt, the bug is not that the encoding is inefficient. The problem is we cannot
unpickle bytes streams from Python 3 using Python 2.
--
___
Python tracker rep...@bugs.python.org
http
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Brett, issue 2919 has had a patch that merges profile/cProfile for a while now but
nobody has tested it yet.
All I need is someone to download the patch, install it, test it on some random
script and tell me if it works. I don't need more
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Antoine, do we have unit tests for this code path?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13503
Alexandre Vassalotti alexan...@peadrop.com added the comment:
This might not be the case anymore, but __module__ can sometimes be None. There
is some discussion about this in Issue 3657. We should define the expected
behavior when this happens.
Also, I don't think backward-compatibility
Alexandre Vassalotti alexan...@peadrop.com added the comment:
I think we are kind of stuck here. I might need to rely on some clever hack to
generate the desired str object in 2.7 without breaking the bytes support in
3.3 and without changing 2.7 itself.
One *dirty* trick I am thinking about
Alexandre Vassalotti alexan...@peadrop.com added the comment:
I don't think it is a bug.
The posted code completely breaks the expected behavior of __getattribute__.
With such an implementation, there is nothing we can do with this object as we
cannot introspect it.
Use the following if you
Alexandre Vassalotti alexan...@peadrop.com added the comment:
We could resort to the text-based protocol which doesn't have these limitations
with respect to object lengths (IIRC). Performance won't be amazing, but we
won't have to modify the current pickle protocol
Alexandre Vassalotti alexan...@peadrop.com added the comment:
The value of the instruction pointer depends on the byte-code. So it's not
portable either.
But, the bigger issue is the fact generator objects do not have names we can
refer to, unlike top-level functions and classes which pickle
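A small sketch of the difference described above. A named callable is pickled as a reference to its module and name, while a generator object has no such name and is rejected. The helper can_pickle and the generator function count_up are made up for this example; the builtin len is used as the named callable.

```python
import pickle


def count_up(n):
    """Simple generator function used only for this illustration."""
    for i in range(n):
        yield i


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


# The builtin is stored by name: the stream contains its module and name,
# not its code.
by_name = pickle.dumps(len)
print(b"builtins" in by_name, b"len" in by_name)

print(can_pickle(len))
print(can_pickle(count_up(3)))
```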
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Although I would really like to see support for pickling generators, it is not
really possible in CPython. This is a recurrent request. I even wrote a small
article about it.
http://peadrop.com/blog/2009/12/29/why-you-cannot-pickle
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Sorry Antoine, I have been busy with school work lately.
I like the general idea and I will try to look at your patch ASAP.
--
___
Python tracker rep...@bugs.python.org
http
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Didn't Victor say that only one seek at the end is necessary per
pickle? If this is the case, I don't think expensive seeks will be an
issue.
--
___
Python tracker rep...@bugs.python.org
New submission from Alexandre Vassalotti alexan...@peadrop.com:
This was mentioned during the review of issue #9410
(http://codereview.appspot.com/1694050/diff/2001/3001#newcode347), however we
forgot to fix this.
The new array-based memo for the Unpickler class assumes incorrectly that memo
Changes by Alexandre Vassalotti alexan...@peadrop.com:
--
nosy: +pitrou
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9965
Alexandre Vassalotti alexan...@peadrop.com added the comment:
I was going to say this method
http://docs.python.org/dev/py3k/library/pickle.html#restricting-globals could
be used to prevent this kind of attack on bytearray. But, I came up with this
fun thing:
pickle.loads(b'\x80
Alexandre Vassalotti alexan...@peadrop.com added the comment:
I get this error with the patch:
python: /home/alex/src/python.org/py3k/Modules/_pickle.c:908:
_Unpickler_ReadFromFile: Assertion `self->next_read_idx == 0' failed.
Aborted
--
___
Python
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Antoine, I fixed these issues in the latest patch posted on Rietveld. Also,
Skip added the buffer limit in Unladen Swallow (see msg112956). We just need to
merge that.
--
Added file: http://bugs.python.org/file18777
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Patch committed in r83740. I will make the documentation update in a separate
commit.
Thanks!
--
resolution: - accepted
stage: patch review - committed/rejected
status: open - closed
versions: -Python 2.6, Python 2.7
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Documentation added in r83741.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5077
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Although it is tempting to fix this, it feels too much like a feature. I am
closing as won't fix because we don't add new features to 2.x.
--
resolution: - wont fix
stage: needs patch - committed/rejected
status: open
Alexandre Vassalotti alexan...@peadrop.com added the comment:
It is too late for 2.6.6 now that it is released.
--
resolution: - wont fix
stage: - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
http
Alexandre Vassalotti alexan...@peadrop.com added the comment:
It is too late now for the 2.x version. And, the huge patch in issue 9410
includes an updated version of this patch for 3.x.
--
resolution: - duplicate
stage: - committed/rejected
status: open - closed
superseder: - Add
Alexandre Vassalotti alexan...@peadrop.com added the comment:
OK I am convinced, the current behavior is fine. Let's close this one.
--
resolution: - wont fix
stage: needs patch - committed/rejected
status: open - closed
___
Python tracker rep
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Ah, that's right Skip. You did fix it in Unladen Swallow's trunk. I will take
a look at your solution.
http://code.google.com/p/unladen-swallow/source/diff?spec=svn1167r=1038format=sidepath=/trunk/Modules/cPickle.cold_path=/trunk
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Too late to make this change in 2.x. And the patch in issue 9410 includes the
optimization for 3.x.
--
resolution: - duplicate
stage: patch review - committed/rejected
status: open - closed
superseder: - Add Unladen
Alexandre Vassalotti alexan...@peadrop.com added the comment:
The security issue mentioned previously has been known for years. And, it is
easy to protect against. See
http://docs.python.org/py3k/library/pickle.html#restricting-globals
Also I am against adding pickling support to code objects
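The approach in the linked documentation section is to override Unpickler.find_class. A minimal sketch along those lines follows; the allow-list SAFE_BUILTINS and the helper restricted_loads are choices made for this example, and a real allow-list would depend on the application.

```python
import builtins
import io
import pickle

# Names from the builtins module that this sketch chooses to allow.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals that are explicitly allowed.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(3)])))

try:
    restricted_loads(pickle.dumps(eval))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```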
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Victor, have you tried using peek() instead of seek()? I mentioned this
previously in msg85780.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue3873
Alexandre Vassalotti alexan...@peadrop.com added the comment:
The patch looks great. Can you update your patch to make it work against the
version of 2to3 in the sandbox (http://svn.python.org/sandbox/trunk/2to3)?
--
___
Python tracker rep
Alexandre Vassalotti alexan...@peadrop.com added the comment:
I have fixed some style nits in your patch. It would be nice to have tests for
the different control paths in instantiate(). But, I realize that would be a
bit annoying.
Apart from that, the patch looks good.
--
Added file
Alexandre Vassalotti alexan...@peadrop.com added the comment:
LGTM
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9268
Alexandre Vassalotti alexan...@peadrop.com added the comment:
The args argument is always a tuple created with Pdata_poptuple(). You can add
an explicit type check. If this check fails a RuntimeError should be raised,
because this would indicate a programming error in pickle
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Now that 2.7 is out, we won't be able to commit this anymore. It is sad to abandon
a good patch like this.
--
resolution: - wont fix
stage: patch review - committed/rejected
status: open - closed
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Closing this since Python 2.7 is out now.
--
nosy: +alexandre.vassalotti
resolution: - wont fix
stage: needs patch - committed/rejected
status: open - closed
___
Python tracker rep
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Actually, you are right. This could be added as a bug fix.
--
resolution: wont fix -
stage: committed/rejected - needs patch
status: closed - open
___
Python tracker rep...@bugs.python.org
Alexandre Vassalotti alexan...@peadrop.com added the comment:
This begs a question, however: why not use regular python bytecode in pickles?
Unlike pickle protocols, the bytecode is not required to be compatible across
Python versions. Furthermore, Python bytecode is designed as a general
Alexandre Vassalotti alexan...@peadrop.com added the comment:
It works for pickle/_pickle and heapq/_heapq, but won't work for io/_io/_pyio.
You can make the dictionary values as lists for the 'blocked' argument for
import_fresh_module(). That would work.
And, can you add documentation
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Good idea!
You can remove the doctest behavior. I don't think it is useful. But if you
remove it, make sure you add a usage message when no argument is given.
--
___
Python tracker rep
Alexandre Vassalotti alexan...@peadrop.com added the comment:
The code for import_module_implementations seems a bit fragile. Prefixing the
module name with an underscore might not always yield the name of the optimized
implementation. An explicit dictionary to map the Python and C
Alexandre Vassalotti alexan...@peadrop.com added the comment:
It's a documentation enhancement request. The updated documentation in Python 3
for pickle added most of the requested details.
--
resolution: - fixed
status: open - closed
___
Python
Alexandre Vassalotti alexan...@peadrop.com added the comment:
The new documentation for pickle in Python 3 fixes this.
I still believe we should outline the differences between pickle.py and
cPickle. But at this point, it's unlikely I will put in the time to do it. Feel
free to open a new issue
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Thank you for the nice investigative work!
I will try my best to review this patch by next week.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5180
Alexandre Vassalotti alexan...@peadrop.com added the comment:
> Do I understand correctly that the issue is that python
> Pickler class has dispatch attribute but C Pickler does
> not?
Yes.
> The add_dispatch_check-0.patch patch does not seem
> to add class attribute, it adds an instance attribute
Alexandre Vassalotti alexan...@peadrop.com added the comment:
One easy fix for this would be to make InteractiveConsole use the string
__main__ instead of __console__. But other than that, I don't think we can
fix this within pickle.
--
___
Python
Alexandre Vassalotti alexan...@peadrop.com added the comment:
How about checking the preconditions before calling asctime()? If the check
fails, then we can raise an exception without crashing.
--
___
Python tracker rep...@bugs.python.org
http
Alexandre Vassalotti alexan...@peadrop.com added the comment:
Committed in r80749 and r80751 (for py3k).
Thank you!
--
resolution: - accepted
stage: patch review - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
Alexandre Vassalotti alexan...@peadrop.com added the comment:
I found the issue. The view types didn't have Py_TPFLAGS_CHECKTYPES set, so the
types were using the old-style binary operators.
Here's a patch that fixes the issue. Please review.
--
Added file: http://bugs.python.org