Re: [Scikit-learn-general] Sharing objects between Python 2 and 3

Juan Nunez-Iglesias Thu, 22 Jan 2015 18:31:56 -0800

Nope, the Py2 RF was saved with joblib!

The SO response might work for standard pickling though, I'll give that a try,
thanks!
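
For readers following along, here is a minimal sketch of what the linked SO answer suggests for standard pickles: pass encoding='latin1' when unpickling under Python 3. The byte string below is a hand-written stand-in for a Python 2 pickle, and the helper names are made up for illustration; nothing here comes from the actual RF file. Whether this also helps with joblib files is not established in this thread.

```python
import pickle

# Hand-written protocol-0 pickle of the Python 2 byte string '\xe9'.
# It stands in for a file written by Python 2 (an assumption for this demo).
PY2_PICKLE = b"S'\\xe9'\n."


def load_py2_pickle(data, encoding="latin1"):
    """Unpickle bytes produced by Python 2 under Python 3.

    `encoding` tells Python 3 how to decode Python 2 `str` objects.
    "latin1" maps every byte to a code point, so decoding cannot fail.
    """
    return pickle.loads(data, encoding=encoding)


def default_load_fails(data):
    """Return True if the default (ASCII) decoding rejects the data."""
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    print(default_load_fails(PY2_PICKLE))
    print(repr(load_py2_pickle(PY2_PICKLE)))
    print(repr(load_py2_pickle(PY2_PICKLE, encoding="bytes")))
```

For a file on disk the same keyword applies to pickle.load(f, encoding='latin1') with the file opened in binary mode. Using encoding='bytes' instead keeps Python 2 strings as raw bytes objects.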

On Fri, Jan 23, 2015 at 11:18 AM, Sebastian Raschka <se.rasc...@gmail.com>
wrote:

> Sorry, I think my previous message was a little bit ambiguous.
> What I would try is:
> 1) Unpickle the original pickle file in Python 2
> 2) Pickle it via joblib
> 3) Load it in Python 3
> (I think you only did step 3, right? Sorry for the confusion.)
> I also just saw a related SO post that might be very helpful: 
> http://stackoverflow.com/questions/11305790/pickle-incompatability-of-numpy-arrays-between-python-2-and-3
> Best,
> Sebastian
>> On Jan 22, 2015, at 5:10 PM, jni.s...@gmail.com wrote:
>> 
>> Hi Sebastian,
>> 
>> Thanks for the response, but actually joblib doesn't work either:
>> 
>> In [1]: from sklearn.externals import joblib
>> 
>> In [2]: rf = joblib.load('rf-1.joblib')
>> ---------------------------------------------------------------------------
>> error                                     Traceback (most recent call last)
>> <ipython-input-3-2c47f0ec1d5b> in <module>()
>> ----> 1 rf = joblib.load('rf-1.joblib')
>> 
>> /Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in load(filename, mmap_mode)
>>     417                               'ignoring mmap_mode "%(mmap_mode)s" flag passed'
>>     418                               % locals(), Warning, stacklevel=2)
>> --> 419             unpickler = ZipNumpyUnpickler(filename, file_handle=file_handle)
>>     420         else:
>>     421             unpickler = NumpyUnpickler(filename, file_handle=file_handle,
>> 
>> /Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in __init__(self, filename, file_handle)
>>     306         NumpyUnpickler.__init__(self, filename,
>>     307                                 file_handle,
>> --> 308                                 mmap_mode=None)
>>     309
>>     310     def _open_pickle(self, file_handle):
>> 
>> /Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in __init__(self, filename, file_handle, mmap_mode)
>>     264         self._dirname = os.path.dirname(filename)
>>     265         self.mmap_mode = mmap_mode
>> --> 266         self.file_handle = self._open_pickle(file_handle)
>>     267         Unpickler.__init__(self, self.file_handle)
>>     268         try:
>> 
>> /Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in _open_pickle(self, file_handle)
>>     309
>>     310     def _open_pickle(self, file_handle):
>> --> 311         return BytesIO(read_zfile(file_handle))
>>     312
>>     313
>> 
>> /Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in read_zfile(file_handle)
>>      66     # We use the known length of the data to tell Zlib the size of the
>>      67     # buffer to allocate.
>> ---> 68     data = zlib.decompress(file_handle.read(), 15, length)
>>      69     assert len(data) == length, (
>>      70         "Incorrect data length while decompressing %s."
>> 
>> error: Error -3 while decompressing data: incorrect header check
>> 
>> 
>> The very same commands work fine in Py2:
>> 
>> In [1]: from sklearn.externals import joblib
>> 
>> In [2]: rf1 = joblib.load('rf-1.joblib')
>> 
>> In [3]:
>> 
>> 
>> Is this unexpected?
>> 
>> 
>> 
>> 
>> On Fri, Jan 23, 2015 at 1:57 AM, Sebastian Raschka <se.rasc...@gmail.com> wrote:
>> 
>> Hi, Juan, 
>> 
>> It's been some time, but I remember that I had similar issues. I think it 
>> has to do with the numpy arrays that specifically cause problems in pickle. 
>> (http://bugs.python.org/issue6784) 
>> 
>> You could try to use joblib (which should also be more efficient): 
>> 
>> >>> from sklearn.externals import joblib 
>> >>> joblib.dump(clf, 'filename.pkl') 
>> >>> clf = joblib.load('filename.pkl') 
>> 
>> (http://scikit-learn.org/stable/modules/model_persistence.html)      
>> 
>> 
>> Best, 
>> Sebastian 
>> 
>> > On Jan 22, 2015, at 8:50 AM, jni.s...@gmail.com wrote: 
>> > 
>> > Hi all, 
>> > 
>> > I'm working on a project that depends on sklearn. I've been upping test 
>> > coverage (which includes saving a RandomForest, so far using joblib 
>> > serialization), and now I wanted to make the project Python 3-compatible. 
>> > However, the final roadblock is the sharing of RF objects: I can't load 
>> > the Python 2-serialized RFs with Python 3 tests. Of course, the test 
>> > outcome depends on the exact RF that was created a while back. Is there 
>> > any way around this? 
>> > 
>> > Thanks! 
>> > 
>> > Juan. 
>> > 
>> > 
>> > ------------------------------------------------------------------------------
>> >  
>> > New Year. New Location. New Benefits. New Data Center in Ashburn, VA. 
>> > GigeNET is offering a free month of service with a new server in Ashburn. 
>> > Choose from 2 high performing configs, both with 100TB of bandwidth. 
>> > Higher redundancy.Lower latency.Increased capacity.Completely compliant. 
>> > http://p.sf.net/sfu/gigenet
>> > _______________________________________________ 
>> > Scikit-learn-general mailing list 
>> > Scikit-learn-general@lists.sourceforge.net 
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general 
>> 

