Sorry, I forgot to mention:

Scikit's Logistic Regression is incredibly fast compared to Weka's. Weka's
implementation (mostly based on this paper
<http://sci2s.ugr.es/keel/pdf/algorithm/articulo/1992-JSTOR-Cessie-Logistic_Regression.pdf>)
is slow as well as VERY memory intensive. Sometimes even a *3 GB* heap was
not enough. My datasets are very small (none of the words listed in my
previous message has more than 100 instances) because I run LR word by word.

Is this because scikit's LR uses the liblinear library?
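
A minimal sketch of the per-word setup described in this thread, to make the
question concrete. It is not the code from the pastebin link. It assumes a
recent scikit-learn (module paths differ from 2013-era releases) and uses
synthetic data in place of the real features; the sizes, labels, and variable
names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in for ONE word: 90 instances, 20 features, 3 senses.
# These sizes are assumptions, chosen to match "fewer than 100 instances".
rng = np.random.default_rng(0)
X = rng.random((90, 20))
X /= X.sum(axis=1, keepdims=True)  # each row sums to 1, values in [0, 1]
y = np.repeat([0, 1, 2], 30)       # hypothetical gold sense labels

# solver="liblinear" requests the liblinear backend explicitly.
# OneVsRestClassifier fits one binary LR per sense.
clf = OneVsRestClassifier(LogisticRegression(solver="liblinear"))

# 10-fold cross-validation, scored as correct_label / total_label per fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"mean accuracy over {len(scores)} folds: {mean_accuracy:.3f}")
```

The labels here are unrelated to the features, so the score only shows the
shape of the pipeline, not a meaningful accuracy. The regularization
parameter C is left at its scikit-learn default.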

Thank you

On Thu, Jan 24, 2013 at 5:25 PM, O. B. <thyme....@gmail.com> wrote:

> Hello all,
>
> I have a problem with my experiments. I used *Logistic Regression (LR)* to
> classify word senses. We have gold tags (the target set) for each word
> instance.
>
> I did 10-fold cross-validation. Some words in my dataset have more than
> two senses, so I wrapped logistic regression with *OneVsRestClassifier*. The
> code is here <http://pastebin.com/c4qvB71A>. Accuracy was not impressive,
> so I suspected an error in my code and picked five words
> to classify using LR on Weka. I used the default settings on Weka, and these
> are the results:
>
>    WORD             Scikit    Weka
>
>    - accommodate    0.3       0.667
>    - bow            0.05      0.681818
>    - display        0.475     0.70
>    - haunt          0.575     0.53
>    - owe            0.2533    0.4375
>
>
> These are the (correct_label / total_label) scores. Except for *haunt*, the
> scores are not consistent, and scikit's are much lower than Weka's. I am not
> saying scikit has a bug; most likely there is a problem in my
> code, or Weka does some pre-processing instead of using the raw data directly.
> Could you explain why there is such a large difference between the Scikit and
> Weka scores?
>
> Every feature vector sums to 1, and its values are between 0 and 1.
>
> Attached file contains outputs of Weka if necessary.
>
> Thank you
>
>
>
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
