[jira] [Commented] (SAMOA-48) Multiple copies of evaluation statistics printed in ensemble methods

ASF GitHub Bot (JIRA) Mon, 26 Oct 2015 06:15:13 -0700

    [ 
https://issues.apache.org/jira/browse/SAMOA-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974179#comment-14974179
 ]


ASF GitHub Bot commented on SAMOA-48:
-------------------------------------

GitHub user gdfm opened a pull request:

    https://github.com/apache/incubator-samoa/pull/39

    SAMOA-48: Multiple copies of evaluation statistics printed in ensemble 
methods

    This removes the duplicate copies of the evaluation output.
    It works fine for VHT.
    For ensembles I ran into something that looks like a race condition.
    Sometimes I get n votes (where n is the ensemble size) for the final 
instance multiple times, and sometimes it never reaches n votes (so nothing 
is printed).
    This behavior is very weird because we are not using threads in local 
mode at all.
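
The vote counting described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in the pull request: `VoteCollector`, `addVote`, and `pendingInstances` are hypothetical names that do not exist in SAMOA. The sketch counts votes per instance id and signals when the count reaches n (the ensemble size). Instances that never reach n stay pending, which corresponds to the "does not print" symptom.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of SAMOA): counts ensemble votes per instance id. */
class VoteCollector {
    private final int ensembleSize;
    // Instance id -> number of votes received so far.
    private final Map<Long, Integer> pending = new HashMap<>();

    VoteCollector(int ensembleSize) {
        if (ensembleSize <= 0) {
            throw new IllegalArgumentException("ensembleSize must be positive");
        }
        this.ensembleSize = ensembleSize;
    }

    /**
     * Records one vote for the given instance.
     * Returns true when the vote count for that instance reaches the ensemble size;
     * the counter is then cleared, so a second full round of votes for the same id
     * would return true again.
     */
    boolean addVote(long instanceId) {
        int count = pending.merge(instanceId, 1, Integer::sum);
        if (count == ensembleSize) {
            pending.remove(instanceId);
            return true;
        }
        return false;
    }

    /** Number of instances that have received some votes but fewer than n. */
    int pendingInstances() {
        return pending.size();
    }

    public static void main(String[] args) {
        VoteCollector collector = new VoteCollector(3);
        System.out.println(collector.addVote(1L)); // first vote
        System.out.println(collector.addVote(1L)); // second vote
        System.out.println(collector.addVote(1L)); // third vote completes the instance
    }
}
```

In a single-threaded run, a nonzero `pendingInstances()` at the end of the stream would suggest that some votes were never delivered rather than lost to a thread race.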

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gdfm/incubator-samoa SAMOA-48

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-samoa/pull/39.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #39
    
----
commit af25e7d1f464323dcd0e2bb5729dc70e5ad36887
Author: Gianmarco De Francisci Morales <[email protected]>
Date:   2015-10-26T12:20:55Z

    SAMOA-48: Fix for VHT

commit 23169b0605005bf9b0b5f2dacb36a16b6b5b3b5d
Author: Gianmarco De Francisci Morales <[email protected]>
Date:   2015-10-26T13:07:16Z

    SAMOA-48: Fix for ensembles (race condition?)

----


> Multiple copies of evaluation statistics printed in ensemble methods
> --------------------------------------------------------------------
>
>                 Key: SAMOA-48
>                 URL: https://issues.apache.org/jira/browse/SAMOA-48
>             Project: SAMOA
>          Issue Type: Bug
>          Components: SAMOA-API
>            Reporter: Gianmarco De Francisci Morales
>
> When running an ensemble method, the evaluation statistics are reported 
> multiple times, probably once per classifier in the ensemble.
> We should de-duplicate the output and print the statistics only once.
> To reproduce:
> {code}
> ./bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar 
> "PrequentialEvaluation -l (classifiers.ensemble.Bagging) -s (ArffFileStream 
> -f ../covtypeNorm.arff)"
> {code}
> {code}
> ...
> 2015-10-26 10:50:45,719 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:191) 
> - total evaluation time: 12 seconds for 580000 instances
> 2015-10-26 10:50:45,719 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:183) 
> - last event is received!
> 2015-10-26 10:50:45,719 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:184) 
> - total count: 580000
> 2015-10-26 10:50:45,719 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:187) 
> - org.apache.samoa.evaluation.EvaluatorProcessorid = 0
> evaluation instances,classified instances,classifications correct 
> (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent)
> 100000.0,122089.0,68.69005397701676,22.105931593255672,-207.85213819763229
> 200000.0,244303.0,70.60330818696455,28.288283767693795,-301.3692505449057
> 300000.0,366827.0,69.57802997053106,43.57425440093415,-345.368559683921
> 400000.0,488954.0,65.99680133509491,39.744010597463216,-423.47218286577856
> 500000.0,611016.0,65.4212000995064,40.65657964083249,-463.8095746384162
> 2015-10-26 10:50:45,719 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:191) 
> - total evaluation time: 12 seconds for 580000 instances
> 2015-10-26 10:50:45,719 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:183) 
> - last event is received!
> 2015-10-26 10:50:45,720 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:184) 
> - total count: 580000
> 2015-10-26 10:50:45,720 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:187) 
> - org.apache.samoa.evaluation.EvaluatorProcessorid = 0
> evaluation instances,classified instances,classifications correct 
> (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent)
> 100000.0,122089.0,68.69005397701676,22.105931593255672,-207.85213819763229
> 200000.0,244303.0,70.60330818696455,28.288283767693795,-301.3692505449057
> 300000.0,366827.0,69.57802997053106,43.57425440093415,-345.368559683921
> 400000.0,488954.0,65.99680133509491,39.744010597463216,-423.47218286577856
> 500000.0,611016.0,65.4212000995064,40.65657964083249,-463.8095746384162
> 2015-10-26 10:50:45,720 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:191) 
> - total evaluation time: 12 seconds for 580000 instances
> 2015-10-26 10:50:45,720 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:183) 
> - last event is received!
> 2015-10-26 10:50:45,720 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:184) 
> - total count: 580000
> 2015-10-26 10:50:45,720 [main] INFO  
> org.apache.samoa.evaluation.EvaluatorProcessor (EvaluatorProcessor.java:187) 
> - org.apache.samoa.evaluation.EvaluatorProcessorid = 0
> evaluation instances,classified instances,classifications correct 
> (percent),Kappa Statistic (percent),Kappa Temporal Statistic (percent)
> 100000.0,122089.0,68.69005397701676,22.105931593255672,-207.85213819763229
> 200000.0,244303.0,70.60330818696455,28.288283767693795,-301.3692505449057
> 300000.0,366827.0,69.57802997053106,43.57425440093415,-345.368559683921
> 400000.0,488954.0,65.99680133509491,39.744010597463216,-423.47218286577856
> 500000.0,611016.0,65.4212000995064,40.65657964083249,-463.8095746384162
> {code}
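
One generic way to print the final statistics only once, as the issue asks, is a run-once guard. The sketch below is an assumption for illustration and is not taken from `EvaluatorProcessor` or from the fix in the pull request; `OnceReporter` and `reportOnce` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard (not part of SAMOA): runs a reporting action at most once. */
class OnceReporter {
    private final AtomicBoolean reported = new AtomicBoolean(false);

    /**
     * Runs the action only on the first call and returns true in that case.
     * Later calls skip the action and return false.
     */
    boolean reportOnce(Runnable action) {
        // compareAndSet flips the flag atomically, so the guard also holds
        // if several threads signal the end of the stream.
        if (reported.compareAndSet(false, true)) {
            action.run();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        OnceReporter reporter = new OnceReporter();
        // Simulate three "last event" signals, one per ensemble member.
        for (int i = 0; i < 3; i++) {
            reporter.reportOnce(() -> System.out.println("final statistics"));
        }
    }
}
```

A guard like this only hides the repeats. Whether the duplicate end-of-stream events should be removed at their source instead is a separate design question that the issue leaves open.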



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
