Bugs item #1724366, was opened at 2007-05-23 18:42
Message generated for change (Comment added) made by gjb1002
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1724366&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Geoffrey Bache (gjb1002)
Assigned to: Nobody/Anonymous (nobody)
Summary: cPickle module doesn't work with universal line endings

Initial Comment:
On UNIX, I cannot read pickle files created on Windows using the cPickle module, even if I open the file with universal line endings. It works fine with the pickle module, but that is of course slower (and I have to read lots of them).

I attach a test case that pickles and unpickles an smtplib.SMTP object, converting the file to DOS format in between. There is nothing special about SMTP; you can use any object at all in a different module. On my system (RHEL4 with Python 2.4.3) I get the following output:

portmoller : pickletest.py cPickle
unix2dos: converting file dump to DOS format ...
Traceback (most recent call last):
  File "pickletest.py", line 14, in ?
    print load(readFile)
ImportError: No module named smtplib

portmoller : pickletest.py pickle
unix2dos: converting file dump to DOS format ...
<smtplib.SMTP instance at 0xb7ea350c>

----------------------------------------------------------------------

>Comment By: Geoffrey Bache (gjb1002)
Date: 2007-05-25 19:24

Message:
Logged In: YES
user_id=769182
Originator: YES

Yes, I'm sure Python is trying to import "smtplib\r".

For various reasons I need to use protocol 0, not least because I use the pickle files as test data and it's much easier to administer a load of text files than a load of binary files.

I will experiment with reading the files in binary mode on Monday and get back to you. My current workaround is to do loads(file.read()) instead of load(file), which I guess carries a performance penalty. Any idea whether this is likely to be slower than just using the pickle module?
(I haven't tested this)

----------------------------------------------------------------------

Comment By: Gabriel Genellina (gagenellina)
Date: 2007-05-25 12:29

Message:
Logged In: YES
user_id=479790
Originator: NO

The culprit is cPickle.c; it takes certain shortcuts for read() and readline() depending on which type of file you pass in. For a true file object, it uses its own implementation for those two methods, ignoring the file mode.

But it appears that there is NO WAY universal line endings could work if the pickle contains any unicode object. The pickle format for Unicode quotes any \n but *not* \r so the unpickler cannot determine, when it sees a "\r", if it is a MAC end-of-line or an embedded "\r".

So, the only safe end-of-line character for a pickle using protocol 0 is "\n", and that means that the file must be written in binary mode. (This may also indicate that you cannot read unicode objects with embedded \r in a MAC using protocol 0, but I don't have a MAC to test it).

So, until this is fixed (either the module or the documentation), one should forget about universal line endings and write all pickle files as binary. (This way ALL lines end in \n and it should work fine on all platforms)

----------------------------------------------------------------------

Comment By: Gabriel Genellina (gagenellina)
Date: 2007-05-25 11:04

Message:
Logged In: YES
user_id=479790
Originator: NO

I don't see any "Attach" button... Just add these lines near the top of the test script:

    original__import = __import__
    def myimport(name, *args):
        print "import",repr(name)
        return original__import(name,*args)
    #end myimport
    __builtins__.__import__ = myimport

----------------------------------------------------------------------

Comment By: Gabriel Genellina (gagenellina)
Date: 2007-05-25 11:00

Message:
Logged In: YES
user_id=479790
Originator: NO

Please try again with this modified version.
I think you will see that Python is trying to import "smtplib\r".

On Windows, trying to read a pickle file with MAC line endings gives a different error:

cPickle.UnpicklingError: pickle data was truncated

It seems that cPickle support for protocol 0 is broken. If you can, try to use the higher, binary protocols; they don't have this problem. Even if you must use protocol 0, always opening the file in binary mode should avoid the problem.

----------------------------------------------------------------------
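
The advice in the thread (keep protocol 0 if needed, but always open the file in binary mode) can be sketched as follows. This is only an illustration under stated assumptions: it uses the modern `pickle` module and current Python syntax rather than the Python 2.4 cPickle discussed above, and the helper names `dump_binary`, `load_binary` and `contains_crlf` are hypothetical, not part of any library.

```python
import datetime
import os
import pickle
import tempfile


def dump_binary(obj, path, protocol=0):
    """Pickle obj to path, opening the file in binary mode ("wb").

    Binary mode stops the platform from rewriting "\n" as "\r\n",
    so every line of a protocol 0 pickle ends in a bare "\n".
    """
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol)


def load_binary(path):
    """Unpickle from path, again in binary mode ("rb")."""
    with open(path, "rb") as f:
        return pickle.load(f)


def contains_crlf(path):
    """Return True if the raw bytes of the file contain a DOS line ending."""
    with open(path, "rb") as f:
        return b"\r\n" in f.read()


if __name__ == "__main__":
    # Hypothetical demo: an object whose class lives in another module,
    # so the pickle stores a module name on its own line.
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        original = datetime.date(2007, 5, 23)
        dump_binary(original, path)
        print("CRLF present:", contains_crlf(path))
        print("round trip ok:", load_binary(path) == original)
    finally:
        os.remove(path)
```

Protocol 0 stays text-like enough to keep under version control, but the files then have to be handled as binary by any tool that might convert line endings (unix2dos, or a version control system with automatic end-of-line conversion).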