Olivier Grisel <olivier.grisel@...> writes:

> You can use the Pipeline class to build a compound classifier that
> binds a text feature extractor with a classifier to get a text
> document classifier in the end.

Done!
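
For anyone following along, here is a minimal sketch of that kind of pipeline. The choice of TfidfVectorizer and SGDClassifier, the step names, and the toy documents are all assumptions made for illustration; the thread does not say which extractor or classifier is actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# Toy corpus, made up for illustration only.
train_docs = [
    "the striker scored a late goal in the match",
    "the team won the league after a tense match",
    "the new processor doubles the memory bandwidth",
    "the laptop ships with a faster processor and more memory",
]
train_labels = ["sport", "sport", "tech", "tech"]

# Bind the text feature extractor and the classifier into one estimator.
# The step names "vect" and "clf" are arbitrary labels.
text_clf = Pipeline([
    ("vect", TfidfVectorizer()),
    ("clf", SGDClassifier(random_state=0)),
])

# fit() runs the extractor and then trains the classifier on its output.
text_clf.fit(train_docs, train_labels)

# predict() takes raw text directly, so callers never touch the features.
predicted = text_clf.predict(["a faster processor with more memory"])
print(predicted)
```

The point of the compound object is that raw strings go in at both fit and predict time, so the vectorizer and the classifier cannot get out of sync.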

> 7s is very long. How long is your text document in bytes?

The text documents are around 50 kB.

> Maybe you could only consider the first kilobytes of the documents and
> ignore the remaining text at testing time (while using the complete
> documents at training time).

I think I am missing something here. If I consider only the first few kilobytes, wouldn't that mean I lose the features in the rest of the document, which in turn might lead to false matches?

> You should also probably profile your script to understand what's
> taking so long. For instance you can use:
>
> http://www.vrplumber.com/programming/runsnakerun/

Excellent, thanks...
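
As a note on the profiling suggestion, a rough sketch of collecting a profile with the standard library's cProfile module is below. The function classify_documents is a hypothetical stand-in for the real slow prediction call, and the file name classify.prof is arbitrary. The saved file is a standard cProfile dump, which is the kind of file a profile viewer such as RunSnakeRun is meant to open.

```python
import cProfile
import io
import os
import pstats
import tempfile


def classify_documents(docs):
    # Hypothetical stand-in for the real (slow) classification step.
    return [len(doc.split()) % 2 for doc in docs]


# Fake documents, made up for illustration only.
docs = ["some text to classify " * 1000 for _ in range(20)]

# Profile only the call of interest.
profiler = cProfile.Profile()
profiler.enable()
results = classify_documents(docs)
profiler.disable()

# Save the raw stats so they can be opened later in a profile viewer.
profile_path = os.path.join(tempfile.gettempdir(), "classify.prof")
profiler.dump_stats(profile_path)

# Also build a text report of the most expensive calls by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

A whole script can be profiled without code changes by running `python -m cProfile -o classify.prof script.py`.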
