Thanks, Sebastian, for the reply.
My training dataset has negative and positive classes in a 1:1 ratio. Yes, I
already performed a grid search for the best parameters (for SVM) and
selected the best value of n_estimators based on accuracy (for Random
Forest).
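For context, a minimal sketch of the kind of grid search described above (synthetic data; the parameter grid is illustrative, not the poster's actual values):

```python
# Hedged sketch: GridSearchCV over SVM hyperparameters on synthetic data.
# The grid values below are placeholders, not the poster's real settings.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
    cv=5,  # 5-fold cross-validation for each parameter combination
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

(Note: this thread predates scikit-learn 0.18, so at the time the relevant classes lived in `sklearn.grid_search`; the paths above are the current ones.)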
I also tried all combinations of features, but could not improve accuracy by
more than 1% at most. The accuracies I observed were:
i) stratified cross-validation: accuracy 96%
ii) train-test split using shuffled stratified k-fold cross-validation (10
iterations): accuracy 94%
iii) validation on an independent test set with the selected features: 89%
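One common cause of a gap like 96% (CV) vs. 89% (independent test) is selecting features on the full dataset before cross-validating, which leaks information into the CV estimate. A minimal sketch (synthetic data; `SelectKBest` and `k=10` are illustrative choices, not the poster's setup) of keeping selection inside the CV loop via a Pipeline:

```python
# Hedged sketch: feature selection wrapped in a Pipeline so it is re-fit
# on each training fold, giving an unbiased cross-validation estimate.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=50, random_state=0)

pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),  # fitted per fold, no leakage
    ("clf", SVC(kernel="rbf", C=1.0)),
])
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(pipe, X, y, cv=cv)
print("CV accuracy: %.3f" % scores.mean())
```

If the CV score estimated this way moves closer to the independent-test score, leakage from pre-CV feature selection is a likely explanation for the original gap.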
What is your opinion about it?
thanks!
Shalu
On Thu, Feb 26, 2015 at 5:07 PM, Sebastian Raschka <se.rasc...@gmail.com>
wrote:
> i) I think in practice, this scenario is highly unlikely (floating
> points), but I am pretty sure it would be the class with the lower integer
> index (due to argmax).
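The tie-breaking behavior mentioned in (i) can be checked directly: NumPy's `argmax` returns the first (lowest-index) position among ties, so a perfectly tied `predict_proba` row resolves to the class with the lower index in `classes_`.

```python
# Minimal check of the tie-breaking claim: argmax picks the first maximum.
import numpy as np

proba = np.array([0.5, 0.5])   # perfectly tied class probabilities
print(proba.argmax())          # 0 -> the lower-indexed class wins the tie
```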
> ii) general question: is one class over- or underrepresented? I assume you
> already did some grid searching and it's the best you could get? Maybe try
> a different classifier or over-/undersampling techniques.
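As one simple illustration of the oversampling idea mentioned above (no extra libraries; purely synthetic data): duplicate minority-class samples until the classes balance. Reweighting via `class_weight="balanced"` in scikit-learn estimators is an alternative that avoids resampling.

```python
# Hedged sketch: naive random oversampling of the minority class.
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(12, 3)
y = np.array([0] * 9 + [1] * 3)          # 9 vs 3: class 1 is the minority

minority = np.where(y == 1)[0]
# draw (with replacement) enough minority indices to match the majority
extra = rng.choice(minority, size=(y == 0).sum() - minority.size)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])
print(np.bincount(y_bal))                 # balanced class counts
```

More principled samplers (e.g. SMOTE) are available in the imbalanced-learn package.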
> iii) Why not, and I think the random forest classifier should be well
> calibrated, too.
> iv) Use k-fold cross validation. (
> http://scikit-learn.org/stable/modules/cross_validation.html)
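In the spirit of answer (iv), one quick overfitting check is to compare training accuracy against k-fold CV accuracy; a large gap suggests overfitting. (Synthetic data; the `RandomForestClassifier` settings are illustrative.)

```python
# Hedged sketch: train-vs-CV accuracy gap as an overfitting indicator.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)

cv_acc = cross_val_score(clf, X, y, cv=5).mean()   # out-of-sample estimate
train_acc = clf.fit(X, y).score(X, y)              # in-sample accuracy
print("train %.3f vs CV %.3f" % (train_acc, cv_acc))
```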
>
> Best,
> Sebastian
>
>
> On Feb 26, 2015, at 8:00 AM, shalu jhanwar <shalu.jhanwa...@gmail.com>
> wrote:
>
> Hey guys,
>
> Would you comment on the following, based on your experience?
>
> i) If both classes have the same predicted probability (0.5), which class
> will Random Forest predict?
> ii) In my classification, I see more false predictions for the positive
> class. Can you suggest how I can improve the model's accuracy, by tuning
> any parameters or otherwise?
> iii) Can I use these probability values to rank my predictions (strong vs.
> weak predictions)?
> iv) How can I check whether or not my model is overfit?
>
> Many thanks!
> Shalu
>
>
------------------------------------------------------------------------------
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the
conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general