Hi Peter,

number of samples: 1 million tweets
number of features: I use the bag-of-words model; in fact, I followed this
example:
http://scikit-learn.github.com/scikit-learn-tutorial/working_with_text_data.html
It uses TF-IDF normalization.
class distribution: equal number of positive and negative tweets
features: I removed stop words, punctuation, URLs and user names.
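
The preprocessing mentioned in the last item could look roughly like the
sketch below. It is only an illustration, not the exact code I ran: the
regular expressions, the tiny STOP_WORDS set and the clean_tweet name are
assumptions made up for this example.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "it", "this"}

# Assumed patterns: a URL starts with http(s):// or www., a user name with @.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
USER_RE = re.compile(r"@\w+")
# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Return lowercase tokens with URLs, user names, punctuation and stop words removed."""
    # URLs and user names go first, while their punctuation is still intact.
    text = URL_RE.sub(" ", text)
    text = USER_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE).lower()
    return [tok for tok in text.split() if tok not in STOP_WORDS]


print(clean_tweet("@peter This is a great tutorial: http://example.com/x !!"))
# prints ['great', 'tutorial']
```

The resulting tokens are what the bag-of-words / TF-IDF step would then
count.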

Adnan

________________________________
 From: Peter Prettenhofer <[email protected]>
To: [email protected] 
Sent: Thursday, February 2, 2012 2:20 PM
Subject: Re: [Scikit-learn-general] Improving the accuracy of classifier
 
Hi Adnan,

can you give us some more specific information about your learning
task / dataset including:

- number of samples

- number of features

- class distribution

- features (normalization, preprocessing)

best,
Peter

2012/2/2 adnan rajper <[email protected]>:
> hi everybody,
>
> I am using the multinomial Naive Bayes and LinearSVC classifiers with default
> parameters to classify Twitter messages into two classes (positive or
> negative). I followed the tutorial at
> http://scikit-learn.github.com/scikit-learn-tutorial/working_with_text_data.html.
> I tried "parameter tuning using grid search", but it is too slow. Both
> classifiers give 75% accuracy. I would like to improve the accuracy, for
> instance to more than 80%. Is there any way to do this with scikit-learn?
>
>
> thanks
> Adnan
>
> ------------------------------------------------------------------------------
> Keep Your Developer Skills Current with LearnDevNow!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe now!
> http://p.sf.net/sfu/learndevnow-d2d
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>



-- 
Peter Prettenhofer
