Hello,
    I have been trying to dump a compressed joblib file, which was working fine
 about a month ago. Previously I had an issue with the amount of memory that
 joblib compression took, and zlib seemed to be the cause, but I got more
 memory to work around the problem.
    However, when I tried to do the same today, I got an error on
 decompression. Is this a known issue with joblib? (I haven't changed my code;
 I was on vacation for a month.) I upgraded scikit-learn to 0.13 and still see
 the issue. The following session demonstrates the steps in my code: loading an
 uncompressed classifier object dumped with joblib, compressing it and dumping
 the new compressed classifier, then loading that compressed file back.

$ ipython
Python 2.6.6 (r266:84292, Sep 11 2012, 08:34:23) 
Type "copyright", "credits" or "license" for more information.

IPython 0.13 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from sklearn.externals import joblib

In [2]: clf=joblib.load("classifier.joblib") #Load uncompressed classifier

In [3]: clf
Out[3]: 
SGDClassifier(alpha=1e-05, class_weight=None, epsilon=0.1, eta0=0.0,
       fit_intercept=True, l1_ratio=0.15, learning_rate='optimal',
       loss='log', n_iter=35, n_jobs=1, penalty='l2', power_t=0.5,
       random_state=None, rho=None, shuffle=False, verbose=0,
       warm_start=False)

In [4]: joblib.dump(clf, "compressedclassifier.joblib", compress=9)
Out[4]: ['compressedclassifier.joblib', 'compressedclassifier.joblib_01.npy.z']

In [5]: clf=joblib.load("compressedclassifier.joblib")
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-ad6b23335871> in <module>()
----> 1 clf=joblib.load("compressedclassifier.joblib")

/home/n7/newenv/lib/python2.6/site-packages/sklearn/externals/joblib/numpy_pickle.pyc in load(filename, mmap_mode)
    422 
    423     try:
--> 424         obj = unpickler.load()
    425     finally:
    426         if hasattr(unpickler, 'file_handle'):

/usr/lib64/python2.6/pickle.pyc in load(self)
    856             while 1:
    857                 key = read(1)
--> 858                 dispatch[key](self)
    859         except _Stop, stopinst:
    860             return stopinst.value

/home/n7/newenv/lib/python2.6/site-packages/sklearn/externals/joblib/numpy_pickle.pyc in load_build(self)
    291                         "but numpy didn't import correctly")
    292             nd_array_wrapper = self.stack.pop()
--> 293             array = nd_array_wrapper.read(self)
    294             self.stack.append(array)
    295 

/home/n7/newenv/lib/python2.6/site-packages/sklearn/externals/joblib/numpy_pickle.pyc in read(self, unpickler)
    157         filename = os.path.join(unpickler._dirname, self.filename)
    158         array = unpickler.np.core.multiarray._reconstruct(*self.init_args)
--> 159         data = read_zfile(open(filename, 'rb'))
    160         state = self.state + (data,)
    161         array.__setstate__(state)

/home/n7/newenv/lib/python2.6/site-packages/sklearn/externals/joblib/numpy_pickle.pyc in read_zfile(file_handle)
     69     assert len(data) == length, (
     70         "Incorrect data length while decompressing %s."
---> 71         "The file could be corrupted." % file_handle)
     72     return data
     73 

AssertionError: Incorrect data length while decompressing <open file 'compressedclassifier.joblib_01.npy.z', mode 'rb' at 0x2d5b5d0>.The file could be corrupted.

In [6]:
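
For context, the assertion in the traceback compares the length of the decompressed data against an expected length. Below is a minimal, stdlib-only sketch of that kind of check. It is not joblib's code or its on-disk format: the 8-byte length header and the helper names `pack` and `unpack` are made up for illustration. It only shows that such a check fails whenever the recorded length and the decompressed payload disagree, whichever side is wrong.

```python
import struct
import zlib

HEADER = "<Q"  # hypothetical header: uncompressed length as 8-byte little-endian
HEADER_SIZE = struct.calcsize(HEADER)


def pack(data, level=9):
    """Return a length header followed by the zlib-compressed payload."""
    return struct.pack(HEADER, len(data)) + zlib.compress(data, level)


def unpack(blob):
    """Decompress the payload and verify it against the recorded length."""
    (expected,) = struct.unpack(HEADER, blob[:HEADER_SIZE])
    data = zlib.decompress(blob[HEADER_SIZE:])
    if len(data) != expected:
        raise ValueError(
            "Incorrect data length: expected %d bytes, got %d"
            % (expected, len(data))
        )
    return data


if __name__ == "__main__":
    payload = b"coef" * 1000
    blob = pack(payload)

    # A clean round trip returns the original bytes.
    assert unpack(blob) == payload

    # Simulate a header that disagrees with the payload.
    tampered = struct.pack(HEADER, len(payload) + 1) + blob[HEADER_SIZE:]
    try:
        unpack(tampered)
    except ValueError as exc:
        print(exc)
```

In this sketch the payload itself decompresses without error; only the length comparison fails, which matches the shape of the error above but does not by itself say which part of the real file is at fault.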


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
