Dear Olivier and Gael,

Thank you guys.

Olivier,

> Do you mean each feature vector sum to 1, right?


Yes, and every value lies between 0 and 1. Each feature vector is
actually a probability distribution.

> You should never use the default settings of a classifier to compare
> scores. Always grid search the optimal values of the most impacting
> hyperparameters. In the case of LogisticRegression you should grid
> search the regularization parameter which is named 'C'.
> Here is the documentation for grid search:
>   http://scikit-learn.org/dev/modules/grid_search.html


While waiting for your answers, I tried changing the regularization
parameter (C), and accuracy increased.

I then ran a grid search to choose the best parameters for Logistic
Regression. Here are the scores:


WORD          Scikit (before grid)   Scikit (after grid)   Weka

accommodate   0.3                    0.47                  0.667
bow           0.05                   0.35                  0.681818
display       0.475                  0.525                 0.70
haunt         0.575                  0.78                  0.53
owe           0.2533                 0.55                  0.4375


Note that I haven't done any grid search for Weka yet. I am not familiar
with Weka, but once I find the right parameter(s) for Weka's Logistic, I
will share my results.

Even without any parameter optimization for Weka, it looks significantly
better on these data. Is there something I missed?

My new code snippet is here: http://pastebin.com/A6xPYVH1
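
For reference, here is a minimal sketch of the kind of grid search over C
described above. It is not the pastebin code. The data is synthetic
(Dirichlet draws, so each feature vector sums to 1, like the features
described above), and the helper name grid_search_c, the list of C values,
and max_iter=1000 are assumptions chosen for illustration. It assumes a
recent scikit-learn, where GridSearchCV lives in sklearn.model_selection
(in older releases it was in sklearn.grid_search). LogisticRegression is
used directly, without OneVsRestClassifier, since it handles multiclass
problems itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def grid_search_c(X, y, c_values=(0.01, 0.1, 1, 10, 100, 1000),
                  n_splits=10, seed=0):
    """Grid search the regularization parameter C with stratified k-fold CV.

    The C grid and fold count are illustrative defaults, not tuned values.
    """
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    search = GridSearchCV(
        LogisticRegression(max_iter=1000),  # no OneVsRestClassifier wrapper
        param_grid={"C": list(c_values)},
        scoring="accuracy",
        cv=cv,
    )
    search.fit(X, y)
    return search


# Synthetic stand-in for one word: 3 senses, 30 instances each.
# Each row is a Dirichlet draw, so it is non-negative and sums to 1.
rng = np.random.RandomState(0)
n_per_class, n_features = 30, 20
X_parts, y_parts = [], []
for label in range(3):
    alpha = np.ones(n_features)
    alpha[label * 5:(label + 1) * 5] = 5.0  # each sense favors 5 features
    X_parts.append(rng.dirichlet(alpha, size=n_per_class))
    y_parts.append(np.full(n_per_class, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

search = grid_search_c(X, y)
print("best C:", search.best_params_["C"])
print("mean CV accuracy: %.3f" % search.best_score_)
```

With real data, X and y would be the feature matrix and gold sense labels
for a single word, and the same call would be repeated word by word.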



On Thu, Jan 24, 2013 at 8:21 PM, Olivier Grisel <olivier.gri...@ensta.org> wrote:

> 2013/1/24 O. B. <thyme....@gmail.com>:
> > Sorry, I forgot to mention:
> >
> > Scikit's Logistic Regression is incredibly fast compared to Weka. Weka's
> > implementation (mostly based on this paper) is slow as well as VERY
> > memory intensive. Sometimes it wasn't enough to allocate 3 GB as heap
> > size. My dataset (none of the words above has more than 100 instances)
> > is very small because I use LR word by word.
> >
> > Is this the case because scikit's LR uses the liblinear library?
> >
> > Thank you
> >
> > On Thu, Jan 24, 2013 at 5:25 PM, O. B. <thyme....@gmail.com> wrote:
> >>
> >> Hello all,
> >>
> >> I have a problem with my experiments. I used Logistic Regression (LR)
> >> to classify word senses. We have gold tags (the target set) for each
> >> word instance.
> >>
> >> I did 10 fold cross validation. Some words in my dataset have more than
> >> two senses so I wrapped logistic regression with OneVsRestClassifier.
>
> You don't need to wrap LogisticRegression in a OneVsRestClassifier
> object as it's already using OvR / OvA for handling multiclass
> internally as explained in the doc:
>
> http://scikit-learn.org/dev/modules/multiclass.html
>
> >> The code is here. Accuracy was not impressive, so I suspected there was
> >> an error in my code. So I picked five words to classify using LR on Weka.
> >> I used default settings on Weka
>
> You should never use the default settings of a classifier to compare
> scores. Always grid search the optimal values of the most impacting
> hyperparameters. In the case of LogisticRegression you should grid
> search the regularization parameter which is named 'C'.
>
> Here is the documentation for grid search:
>
>   http://scikit-learn.org/dev/modules/grid_search.html
>
> >> and these are the results:
> >>
> >>                WORD                    Scikit                Weka
> >>
> >> accommodate             0.3                   0.667
> >> bow                              0.05                 0.681818
> >> display                         0.475               0.70
> >> haunt                            0.575               0.53
> >> owe                              0.2533             0.4375
> >
> >>
> >> These are the (correct_label / total_label) scores. Except for haunt, the
> >> scores are not consistent, and scikit's are significantly lower than
> >> Weka's. I am not saying scikit has a bug; most likely there is a problem
> >> in my code, or Weka does some pre-processing instead of using the raw
> >> data directly. Could you explain why there is such a huge difference
> >> between the Scikit and Weka scores?
> >>
> >> Every features have sum to 1 and their values are between 0 and 1.
>
> Do you mean each feature vector sum to 1, right?
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> ------------------------------------------------------------------------------
> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
> MVPs and experts. ON SALE this month only -- learn more at:
> http://p.sf.net/sfu/learnnow-d2d
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>



-- 
Osman Başkaya
Koc University
MS Student | Computer Science and Engineering