[issue12596] cPickle - stored data differ for same dictionary

2013-05-02 Thread Alexandre Vassalotti

Alexandre Vassalotti added the comment:

There is no guarantee that the binary representation of pickled data will be
the same between different runs. We try to make it mostly consistent when we
can, but there are cases, like this one, where we cannot ensure consistency
without hurting performance significantly.

--
nosy: +alexandre.vassalotti
resolution:  -> works for me
stage: needs patch -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12596
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12596] cPickle - stored data differ for same dictionary

2013-02-17 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a minimal reproducer. Results:

pickle.dumps('spam', 2)
    0: \x80 PROTO      2
    2: U    SHORT_BINSTRING 'spam'
    8: q    BINPUT     0
   10: .    STOP
highest protocol among opcodes = 2

pickle.dumps('spam1'[:-1], 2)
    0: \x80 PROTO      2
    2: U    SHORT_BINSTRING 'spam'
    8: q    BINPUT     0
   10: .    STOP
highest protocol among opcodes = 2

cPickle.dumps('spam', 2)
    0: \x80 PROTO      2
    2: U    SHORT_BINSTRING 'spam'
    8: q    BINPUT     1
   10: .    STOP
highest protocol among opcodes = 2

cPickle.dumps('spam1'[:-1], 2)
    0: \x80 PROTO      2
    2: U    SHORT_BINSTRING 'spam'
    8: .    STOP
highest protocol among opcodes = 2

The difference between the 3rd and 4th examples is BINPUT 1. In the last case
the string has refcount=1, so BINPUT is not emitted because of an optimization.
Note that the Python implementation emits BINPUT with a different number (0
rather than 1).
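
To make the effect of the extra BINPUT concrete, here is a minimal sketch for Python 3, where cPickle no longer exists. The two byte strings are written by hand to mirror the 3rd and 4th disassemblies above; they are an assumption for illustration, not output captured from cPickle. The sketch shows that both load to the same value and that pickletools.optimize(), which removes unused PUT opcodes, maps both to the same bytes.

```python
import pickle
import pickletools

# Hand-written protocol 2 pickles (an assumption for illustration) that mirror
# the two cPickle disassemblies: one with an unused BINPUT 1, one without.
with_put = b'\x80\x02U\x04spamq\x01.'
without_put = b'\x80\x02U\x04spam.'

# Print the opcode listing of each pickle.
pickletools.dis(with_put)
pickletools.dis(without_put)

# Both byte strings unpickle to the same value.
values_equal = pickle.loads(with_put) == pickle.loads(without_put)

# optimize() drops PUT opcodes that no GET refers to, so the difference vanishes.
normalized_equal = pickletools.optimize(with_put) == pickletools.optimize(without_put)

print(values_equal, normalized_equal)
```

Note that pickletools.optimize() only removes unused PUT opcodes. It does not promise a canonical byte representation in general, so this is a way to inspect the difference rather than a fix for it.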

--
Added file: http://bugs.python.org/file29108/cPickletest2.py




[issue12596] cPickle - stored data differ for same dictionary

2013-02-07 Thread Antoine Pitrou

Antoine Pitrou added the comment:

As soon as hash randomization is turned on (and it's the default starting with 
Python 3.3), the pickled representation of dicts will also vary from run to run:

$ python -R -c "import pickle; print pickle.dumps({'a':1, 'b':2})" | md5sum
c0ae6b7f62b9c0839be883dd1efee84e  -
$ python -R -c "import pickle; print pickle.dumps({'a':1, 'b':2})" | md5sum
b03bf608516f3e0244a96d740139b050  -

--
nosy: +pitrou




[issue12596] cPickle - stored data differ for same dictionary

2013-02-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

It is surprising that the pickled representation of a 1-element dict varies
from run to run.

--
components: +Extension Modules -None




[issue12596] cPickle - stored data differ for same dictionary

2013-02-07 Thread Ramchandra Apte

Ramchandra Apte added the comment:

Try `./python -R -c "import pickle; print(pickle.dumps({'a':1, 'v':1}))"
| md5sum`. The output will differ on subsequent runs, whereas with `./python -R
-c "import pickle; print(pickle.dumps({'a':1}))" | md5sum` the output is always
the same. I suspect because the order of dicts are different on every run (try
repr).

--
nosy: +ramchandra.apte
versions: +Python 3.3, Python 3.4




[issue12596] cPickle - stored data differ for same dictionary

2013-02-07 Thread Ramchandra Apte

Ramchandra Apte added the comment:

Darn, last sentence has some mistakes.
I suspect this issue is happening because the order of a dictionary is 
different on every run (try repr).

--




[issue12596] cPickle - stored data differ for same dictionary

2013-02-07 Thread Ramchandra Apte

Ramchandra Apte added the comment:

Further proof:
here are the results of two invocations of `./python -R -c "import pickle;
print(pickle.dumps({'a':1, 'v':1}))"`

b'\x80\x03}q\x00(X\x01\x00\x00\x00vq\x01K\x01X\x01\x00\x00\x00aq\x02K\x01u.'
b'\x80\x03}q\x00(X\x01\x00\x00\x00aq\x01K\x01X\x01\x00\x00\x00vq\x02K\x01u.'
Notice that in the second output the pickled data for 'v' has exchanged places
with the data for 'a'! (In the first output 'v' is pickled first and 'a'
second; in the second output 'a' is pickled first and 'v' second.)
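
The same swap can be reproduced without relying on hash randomization. This is a minimal sketch assuming Python 3.7 or later, where dicts preserve insertion order, so building the dict in a different key order stands in for the run-to-run reordering seen above. The variable names are made up for illustration.

```python
import pickle

# Two dicts that compare equal but were built in a different key order.
# Assumption: Python 3.7+, where dicts keep insertion order.
first = {'v': 1, 'a': 1}
second = {'a': 1, 'v': 1}

blob_first = pickle.dumps(first, protocol=3)
blob_second = pickle.dumps(second, protocol=3)

same_value = first == second                      # dict equality ignores order
same_bytes = blob_first == blob_second            # the pickles follow key order
round_trip_equal = pickle.loads(blob_first) == pickle.loads(blob_second)

print(blob_first)
print(blob_second)
print(same_value, same_bytes, round_trip_equal)
```

The pickler writes items in iteration order, so equal dicts can produce different bytes while still loading to equal objects.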

--




[issue12596] cPickle - stored data differ for same dictionary

2013-02-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Most probably the difference is caused by string interning.

--




[issue12596] cPickle - stored data differ for same dictionary

2013-02-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
nosy: +serhiy.storchaka
stage:  -> needs patch




[issue12596] cPickle - stored data differ for same dictionary

2011-07-20 Thread Philipp Mölders

New submission from Philipp Mölders philipp.moeld...@googlemail.com:

I think there is a problem in cPickle. I wanted to store a dictionary with
only one entry using cPickle.dump(). This works fine, and the data can be
loaded with cPickle.load(). But if you store the loaded data with
cPickle.dump() again, the stored data differ from the data stored the first
time. Loading still works fine; only the data written to disk differ. I've
written a sample script that demonstrates the problem.
This problem occurs only in the 2.7 version of Python and only with
dictionaries with one entry.

--
components: None
files: cPickletest.py
messages: 140750
nosy: Philipp.Mölders
priority: normal
severity: normal
status: open
title: cPickle - stored data differ for same dictionary
versions: Python 2.7
Added file: http://bugs.python.org/file22706/cPickletest.py




[issue12596] cPickle - stored data differ for same dictionary

2011-07-20 Thread Philipp Mölders

Changes by Philipp Mölders philipp.moeld...@googlemail.com:


--
type:  -> behavior




[issue12596] cPickle - stored data differ for same dictionary

2011-07-20 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

If the load produces the same result, why does it matter that what is on disk 
differs?

--
nosy: +r.david.murray




[issue12596] cPickle - stored data differ for same dictionary

2011-07-20 Thread Philipp Mölders

Philipp Mölders philipp.moeld...@googlemail.com added the comment:

The file on disk matters for a replication service. A file that is touched but
whose content has not changed is not replicated, but in this special case the
bytes on disk change even though the data structures have not. If this happens
very often, it could cause a lot of unnecessary replication, because nothing
has actually changed.
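
One possible workaround for this use case is to compare the stored value, not the stored bytes, before rewriting the file. The following is only a sketch: dump_if_changed is a hypothetical helper, not part of pickle or cPickle, and it assumes that the pickled objects support a meaningful == comparison and that the existing file comes from a trusted source, since unpickling untrusted data is unsafe.

```python
import pickle


def dump_if_changed(path, obj, protocol=2):
    """Pickle obj to path only if the value already stored there differs.

    Hypothetical helper for illustration. Returns True if the file was
    written and False if it was left untouched.
    """
    try:
        with open(path, 'rb') as f:
            if pickle.load(f) == obj:
                return False  # same value: leave the bytes on disk alone
    except (OSError, EOFError, pickle.UnpicklingError):
        pass  # missing or unreadable file: fall through and write it
    with open(path, 'wb') as f:
        pickle.dump(obj, f, protocol=protocol)
    return True
```

With a helper like this, re-saving a dictionary that was just loaded leaves the file unchanged, so a replication service that watches file contents would see nothing new.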

--
