Hi, some quick thoughts:
- if you use a multinomial Naive Bayes classifier (aka a language model), you can fit a background model on the large dataset and use that to smooth the model fitted on the smaller dataset.
- you should look at the domain adaptation / multi-task learning literature; this might fit your setting better than traditional semi-supervised learning.

best,
 Peter

2012/7/9 Gael Varoquaux <[email protected]>:
> Hi,
>
> You can try setting this as a semi-supervised learning problem and using
> label propagation:
>
> http://scikit-learn.org/stable/modules/label_propagation.html#label-propagation
>
> HTH,
>
> G
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

-- 
Peter Prettenhofer
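
P.S. To make the background-model suggestion a bit more concrete, here is a rough sketch in plain Python (no scikit-learn). It is only one possible reading of the idea: the per-class word counts from the small labeled set are smoothed with a Dirichlet prior centred on a unigram distribution estimated from the large set. The function names, the toy data, and the parameters `alpha` and `mu` are made up for illustration, and the sketch assumes documents are already tokenized and that words missing from the large corpus can simply be ignored.

```python
import math
from collections import Counter


def background_model(docs, alpha=1.0):
    """Unigram word distribution over the large corpus (add-alpha smoothed)."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values()) + alpha * len(counts)
    return {w: (c + alpha) / total for w, c in counts.items()}


def fit_smoothed_nb(docs, labels, background, mu=10.0):
    """Multinomial NB whose per-class word probabilities are pulled towards
    the background distribution:
        p(w | c) = (count(w, c) + mu * p_bg(w)) / (total(c) + mu)
    `mu` is a hypothetical smoothing strength; larger values trust the
    background model more."""
    class_counts = {}
    for doc, y in zip(docs, labels):
        # Assumption: words unseen in the large corpus are dropped.
        in_vocab = (tok for tok in doc if tok in background)
        class_counts.setdefault(y, Counter()).update(in_vocab)

    doc_counts = Counter(labels)
    model = {}
    for y, counts in class_counts.items():
        total = sum(counts.values())
        log_prior = math.log(doc_counts[y] / len(labels))
        log_word = {w: math.log((counts[w] + mu * p) / (total + mu))
                    for w, p in background.items()}
        model[y] = (log_prior, log_word)
    return model


def predict(model, doc):
    """Return the class with the highest log-posterior for a tokenized doc."""
    def score(y):
        log_prior, log_word = model[y]
        return log_prior + sum(log_word[t] for t in doc if t in log_word)
    return max(model, key=score)


# Toy data, purely illustrative.
large_docs = [["cheap", "pills", "offer"], ["meeting", "agenda", "notes"],
              ["cheap", "offer", "today"], ["project", "meeting", "today"]]
small_docs = [["cheap", "offer"], ["meeting", "notes"]]
small_labels = ["spam", "ham"]

background = background_model(large_docs)
model = fit_smoothed_nb(small_docs, small_labels, background, mu=10.0)
print(predict(model, ["cheap", "pills", "today"]))
```

Note that "pills" and "today" never occur in the small labeled set, yet they still get a sensible non-zero probability in every class because of the background model; that is the whole point of the smoothing. In practice you would tune `mu` on held-out data.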
