On 01/23/2012 02:46 PM, Lars Buitinck wrote:
> 2012/1/23 Dimitrios Pritsos<[email protected]>:
>> I will give it a try; however, in some of my tests I had a memory
>> management problem. As I recall, it was mostly because of a numpy
>> function that might ask PyTables to load everything into main memory.
>> I guess some loops and some slicing might solve the problem.
> No experience with PyTables, sorry.
>
>> However, I will first try to figure out how to use
>> linear_model.SGDClassifier, which is supposed to be capable of being
>> trained in stages. Plus, since I am using a linear kernel, it won't
>> affect my results.
> Is that an SVC(kernel="linear") or a LinearSVC? The latter should be
> able to handle a 50k samples array if the number of features is kept
> within some bound (a few 100k should certainly be fine).
>

Are you sure about that? I ran both and they seem to behave almost the
same in terms of memory handling: neither is able to cope with
33k samples x 30k features because of main memory issues.
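As a rough check on those numbers, the sketch below estimates how much memory a 33k x 30k training array would need if it were held densely. The float64 element size (8 bytes per value) is an assumption, not something stated in this thread; a sparse representation would need far less.

```python
def dense_array_bytes(n_samples, n_features, bytes_per_value=8):
    """Bytes needed to hold a dense n_samples x n_features array."""
    return n_samples * n_features * bytes_per_value


# Figures from the message: 33k samples x 30k features.
n_samples, n_features = 33_000, 30_000

# Assumption: dense float64 storage, i.e. 8 bytes per value.
size_bytes = dense_array_bytes(n_samples, n_features)
size_gib = size_bytes / 2**30

print(f"{size_bytes:,} bytes, about {size_gib:.1f} GiB")
```

Under that assumption a single copy of the array already takes several GiB, so any code path that materialises the whole array, or makes a working copy of it, could plausibly exhaust main memory.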
Note that I am passing the EArray directly as the argument, and it loads
into memory only the slice that is asked for. I therefore conclude that
the sklearn.svm fit() functions do not carry out the fitting with some
sort of iteration; they 'ask' for the whole training array at once.
However, I have no insight into the implementation or the link to
LibLinear.

LibLinear does have an implementation for what I am asking (i.e. dividing
the training set into segments and fitting the model segment by segment),
but it is not a seamless solution for me, and in addition it requires
transforming the data to a binary file. In any case this will be my last
option.

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
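The segment-by-segment fitting discussed in this message can also be sketched on the scikit-learn side, using an estimator that supports incremental training. This is only a sketch under assumptions: it presumes the estimator exposes partial_fit(X, y, classes=...), as SGDClassifier does in releases that include that method, and the helper names iter_row_slices and fit_in_chunks are hypothetical, not part of scikit-learn or PyTables.

```python
def iter_row_slices(n_rows, chunk_size):
    """Yield consecutive slices that together cover rows 0..n_rows-1."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_rows, chunk_size):
        yield slice(start, min(start + chunk_size, n_rows))


def fit_in_chunks(model, X, y, classes, chunk_size=1000):
    """Train `model` incrementally, reading one row slice of X at a time.

    `model` is assumed to expose partial_fit(X, y, classes=...).
    X and y only need to support len() and slice indexing, so X could be
    an on-disk array that returns just the requested rows.
    """
    n_rows = len(X)
    for rows in iter_row_slices(n_rows, chunk_size):
        # Only this slice of X and y is requested from the containers.
        model.partial_fit(X[rows], y[rows], classes=classes)
    return model


# Hypothetical usage (not run here; `earray` and `labels` are placeholders):
#   from sklearn.linear_model import SGDClassifier
#   clf = SGDClassifier(loss="hinge")  # hinge loss: linear SVM objective
#   fit_in_chunks(clf, earray, labels, classes=[0, 1], chunk_size=2000)
```

This makes a single pass over the rows in storage order. Stochastic gradient descent usually benefits from shuffled data and several passes, so the result may differ from what LinearSVC would give on the full array.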
