On 12 August 2012 15:35, Zach Bastick <[email protected]> wrote:
> I have tried various machine learning algorithms from scikit-learn but
> can't find a good prediction model.
> The features I'm using are the tf-idf of a set of text documents,
> correlated with human ratings assigned to each document. I'm thinking
> that I must be doing something wrong, as the scores can't be that bad
> (not to mention negative?).
>
> If someone could have a look at it, I'd really appreciate it. I didn't
> upload to a GitHub gist because they won't let me upload the dataset
> directory. So I've uploaded my really short code (regression.py) AND the
> original data set (/texts) here (625K):
> https://dl.dropbox.com/u/74279156/regression.zip
>
> This is my output:
> C:\python code\program>python regression.py
> loading texts...
> n_samples: 53, n_features: 6284
>
> LinearRegresson
> [ 0.34662496 0.23446674 0.30332109 0.3163838 0.01607913]
> Accuracy: 0.24 (+/- 0.06)
>
> SVR linear
> [-0.05521329 -1.61280714 -0.67428098 -0.8805647 -2.20730703]
> Accuracy: -1.09 (+/- 0.37)
>
> SVR poly 4 degrees
> [-0.18814233 -1.78480475 -0.88158686 -1.05944432 -2.40284073]
> Accuracy: -1.26 (+/- 0.38)
>
> SVR sigmoid
> [-0.18814233 -1.78480475 -0.88158686 -1.05944432 -2.40284073]
> Accuracy: -1.26 (+/- 0.38)
>
>
> Please tell me what's wrong. I'm dying to know how to get scikit-learn
> to predict based on this dataset.
>
> Thanks
>
> Zach
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
Where can I get the plugins from?
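
On the negative scores in the quoted output: assuming they come from a regressor's default score method in scikit-learn, they are R^2 values rather than accuracies, and R^2 drops below zero whenever a model predicts worse than simply guessing the mean of the targets. The sketch below is not taken from regression.py; the function name r2 and the toy numbers are made up for illustration, and it uses only plain Python.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Hypothetical helper written for illustration; it assumes the
    quoted scores follow this same definition.
    """
    mean = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean of y_true.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


ratings = [1, 2, 3, 4, 5]            # toy targets, not the real data set

perfect = r2(ratings, ratings)       # every prediction is exact
baseline = r2(ratings, [3] * 5)      # always predict the mean
worse = r2(ratings, [5, 4, 3, 2, 1]) # predictions are anti-correlated

print(perfect, baseline, worse)
```

Read this way, a score of 0 means "no better than predicting the mean rating" and anything negative means "worse than that", which is plausible when 53 samples are spread over 6284 features.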
--
Public key at: http://pgp.mit.edu/ Search for this email address and select
the key from "2011-08-19" (key id: 54BA8735)