Hey guys! I am currently trying to do multilabel prediction using textual features (e.g., TF-IDF).
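
Roughly, the setup looks like the sketch below. The documents are made-up placeholders, and encoding the label tuples with MultiLabelBinarizer and wrapping LinearSVC in OneVsRestClassifier is an assumption about how such a y would be fed to scikit-learn, not necessarily the exact code that crashes.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy stand-ins for the real corpus (hypothetical texts).
docs = [
    "sparse linear models for text classification",
    "support vector machines with many labels per document",
    "a short note on segmentation faults",
    "text features and linear support vector machines",
]

# One tuple of label ids per sample; the number of labels varies.
labels = [(19, 8, 7, 5), (8, 22, 23, 6, 18, 3), (22,), (8, 22)]

# TF-IDF features as a sparse matrix, one row per document.
X = TfidfVectorizer().fit_transform(docs)

# Assumption: turn the tuples into a sparse binary indicator matrix,
# one column per distinct label id.
mlb = MultiLabelBinarizer(sparse_output=True)
Y = mlb.fit_transform(labels)

# Assumption: one binary LinearSVC per label (one-vs-rest).
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X, Y)

# Map the predicted indicator rows back to tuples of label ids.
predicted = mlb.inverse_transform(clf.predict(X))
```

With only a handful of samples this runs without trouble; it is meant to show the shape of X and y, not to reproduce the crash.
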
The number of labels varies per sample: one sample can have a single label and another can have 10. For now I simply built a list of tuples for my y vector, for example:

(19, 8, 7, 5)
(8, 22, 23, 6, 18, 3)
(22,)
...

As a first step I decided to use LinearSVC. When I train the classifier on about 10,000 samples, everything works and the prediction output looks fine. But as soon as I use all my samples (~300,000), python.exe crashes on Windows. So I tried it on my Linux server, and there I get a segfault.

Does anyone know how this can happen? Am I doing something wrong?

I have some more questions regarding multilabel classification, but let's stick to this first ;)

Many Regards,
Philipp

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general