Hi Sebastian,

Thanks for the response, but actually joblib doesn't work either:

In [1]: from sklearn.externals import joblib

In [2]: rf = joblib.load('rf-1.joblib')
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-3-2c47f0ec1d5b> in <module>()
----> 1 rf = joblib.load('rf-1.joblib')

/Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in load(filename, mmap_mode)
    417                               'ignoring mmap_mode "%(mmap_mode)s" flag passed'
    418                               % locals(), Warning, stacklevel=2)
--> 419             unpickler = ZipNumpyUnpickler(filename, file_handle=file_handle)
    420         else:
    421             unpickler = NumpyUnpickler(filename, file_handle=file_handle,

/Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in __init__(self, filename, file_handle)
    306         NumpyUnpickler.__init__(self, filename,
    307                                 file_handle,
--> 308                                 mmap_mode=None)
    309
    310     def _open_pickle(self, file_handle):

/Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in __init__(self, filename, file_handle, mmap_mode)
    264         self._dirname = os.path.dirname(filename)
    265         self.mmap_mode = mmap_mode
--> 266         self.file_handle = self._open_pickle(file_handle)
    267         Unpickler.__init__(self, self.file_handle)
    268         try:

/Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in _open_pickle(self, file_handle)
    309
    310     def _open_pickle(self, file_handle):
--> 311         return BytesIO(read_zfile(file_handle))
    312
    313

/Users/nuneziglesiasj/anaconda/envs/py3k-gala/lib/python3.3/site-packages/sklearn/externals/joblib/numpy_pickle.py in read_zfile(file_handle)
     66     # We use the known length of the data to tell Zlib the size of the
     67     # buffer to allocate.
---> 68     data = zlib.decompress(file_handle.read(), 15, length)
     69     assert len(data) == length, (
     70         "Incorrect data length while decompressing %s."

error: Error -3 while decompressing data: incorrect header check

The very same commands work fine in Py2:

In [1]: from sklearn.externals import joblib

In [2]: rf1 = joblib.load('rf-1.joblib')

In [3]:

Is this unexpected?
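
For reference, here is a minimal sketch of one way zlib can produce this exact message. It is not a confirmed diagnosis of rf-1.joblib: the container layout, the helper names write_blob and read_blob, and the header widths are all hypothetical and are not joblib's actual on-disk format. It only shows that starting decompression one byte before the real zlib stream fails with "incorrect header check".

```python
import zlib


def write_blob(payload, header_width):
    # Hypothetical layout for illustration only (not joblib's real format):
    # a space-padded ASCII hex length, followed by the zlib stream.
    header = hex(len(payload)).encode("ascii").ljust(header_width)
    return header + zlib.compress(payload)


def read_blob(blob, header_width):
    # Parse the declared length, then decompress everything after the header.
    length = int(blob[:header_width].decode("ascii"), 16)
    data = zlib.decompress(blob[header_width:], 15, length)
    assert len(data) == length, "Incorrect data length while decompressing"
    return data


payload = b"random forest bytes " * 50
blob = write_blob(payload, header_width=20)

# Reader and writer agree on the header width: the round trip succeeds.
assert read_blob(blob, header_width=20) == payload

# Reader assumes a header one byte narrower: decompression now starts on a
# padding space instead of the zlib header, and zlib rejects the stream.
message = None
try:
    read_blob(blob, header_width=19)
except zlib.error as exc:
    message = str(exc)

print(message)
```

If something like this is going on, the file itself would be intact and only the reader's idea of where the compressed data starts would differ between the two interpreters.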

On Fri, Jan 23, 2015 at 1:57 AM, Sebastian Raschka <se.rasc...@gmail.com>
wrote:

> Hi, Juan,
> It's been some time, but I remember having similar issues. I think it
> has to do with NumPy arrays specifically, which cause problems in pickle
> (http://bugs.python.org/issue6784).
> You could try joblib (which should also be more efficient):
> >>> from sklearn.externals import joblib
> >>> joblib.dump(clf, 'filename.pkl')
> >>> clf = joblib.load('filename.pkl')
> (http://scikit-learn.org/stable/modules/model_persistence.html)
>
> Best,
> Sebastian
>> On Jan 22, 2015, at 8:50 AM, jni.s...@gmail.com wrote:
>> 
>> Hi all,
>> 
>> I'm working on a project that depends on sklearn. I've been upping test coverage
>> (which includes saving a RandomForest, so far using joblib serialization), 
>> and now I wanted to make the project Python 3-compatible. However, the final 
>> roadblock is the sharing of RF objects: I can't load the Python 2-serialized 
>> RFs with Python 3 tests. Of course, the test outcome depends on the exact RF 
>> that was created a while back. Is there any way around this?
>> 
>> Thanks!
>> 
>> Juan.
>> 
>> 
>> ------------------------------------------------------------------------------
>> New Year. New Location. New Benefits. New Data Center in Ashburn, VA.
>> GigeNET is offering a free month of service with a new server in Ashburn.
>> Choose from 2 high performing configs, both with 100TB of bandwidth.
>> Higher redundancy.Lower latency.Increased capacity.Completely compliant.
>> http://p.sf.net/sfu/gigenet
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general