[ 
https://issues.apache.org/jira/browse/LUCENE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044422#comment-14044422
 ] 

Tommaso Teofili commented on LUCENE-5699:
-----------------------------------------

I get some compile errors when trying to build the classification module (with 
'ant clean compile'):
{code}
common.compile-core:
    [mkdir] Created dir: 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/build/classification/classes/java
    [javac] Compiling 6 source files to 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/build/classification/classes/java
    [javac] 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/classification/src/java/org/apache/lucene/classification/KNearestNeighborClassifier.java:37:
 error: package org.mockito.internal.listeners does not exist
    [javac] import org.mockito.internal.listeners.CollectCreatedMocks;
    [javac]                                      ^
    [javac] 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/classification/src/java/org/apache/lucene/classification/ClassificationResult.java:24:
 warning: [rawtypes] found raw type: Comparable
    [javac] public class ClassificationResult<T> implements Comparable{
    [javac]                                                 ^
    [javac]   missing type arguments for generic class Comparable<T>
    [javac]   where T is a type-variable:
    [javac]     T extends Object declared in interface Comparable
    [javac] 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/classification/src/java/org/apache/lucene/classification/ClassificationResult.java:69:
 warning: [unchecked] unchecked cast
    [javac]             ClassificationResult<T> b = (ClassificationResult<T>) o;
    [javac]                                                                   ^
    [javac]   required: ClassificationResult<T>
    [javac]   found:    Object
    [javac]   where T is a type-variable:
    [javac]     T extends Object declared in class ClassificationResult
    [javac] 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/classification/src/java/org/apache/lucene/classification/KNearestNeighborClassifier.java:132:
 warning: [unchecked] unchecked method invocation: method sort in class 
Collections is applied to given types
    [javac]         Collections.sort(returnList);
    [javac]                         ^
    [javac]   required: List<T>
    [javac]   found: List<ClassificationResult<BytesRef>>
    [javac]   where T is a type-variable:
    [javac]     T extends Comparable<? super T> declared in method 
<T>sort(List<T>)
    [javac] 
/Users/tommaso/Documents/workspaces/lucene/trunk/lucene/classification/src/java/org/apache/lucene/classification/SimpleNaiveBayesClassifier.java:182:
 warning: [unchecked] unchecked method invocation: method sort in class 
Collections is applied to given types
    [javac]       Collections.sort(dataList);
    [javac]                       ^
    [javac]   required: List<T>
    [javac]   found: List<ClassificationResult<BytesRef>>
    [javac]   where T is a type-variable:
    [javac]     T extends Comparable<? super T> declared in method 
<T>sort(List<T>)
    [javac] 1 error
    [javac] 4 warnings
{code}
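
The rawtypes and unchecked warnings above all trace back to implementing the raw Comparable interface. As a minimal sketch of one way to remove them, assuming a simplified stand-in class (ScoredClass, its fields and the SortDemo helper are hypothetical names, not the code in the patch), the type argument can be supplied on the interface so that compareTo needs no cast:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ClassificationResult<T>; names are assumptions.
final class ScoredClass<T> implements Comparable<ScoredClass<T>> {
  private final T assignedClass;
  private final double score;

  ScoredClass(T assignedClass, double score) {
    this.assignedClass = assignedClass;
    this.score = score;
  }

  T getAssignedClass() {
    return assignedClass;
  }

  double getScore() {
    return score;
  }

  @Override
  public int compareTo(ScoredClass<T> other) {
    // Parameterized signature: no cast from Object is needed.
    // Higher scores sort first.
    return Double.compare(other.score, this.score);
  }
}

final class SortDemo {
  private SortDemo() {}

  // Returns a copy sorted by descending score.
  static List<ScoredClass<String>> sortedByScore(List<ScoredClass<String>> in) {
    List<ScoredClass<String>> copy = new ArrayList<>(in);
    // T is inferred as ScoredClass<String>, which satisfies
    // T extends Comparable<? super T>, so the call is checked.
    Collections.sort(copy);
    return copy;
  }
}
```

With this shape, Collections.sort on a List<ScoredClass<String>> should compile without the unchecked method invocation warning, because the element type now declares what it can be compared to.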

The fix for the compile error is trivial. However, apart from the strange 
import of org.mockito.internal.listeners.CollectCreatedMocks in KNN (which I 
guess is caused by some "automatic organize imports" IDE magic), I'm not sure 
about the suggested approach of creating multiple lists of classification 
results, sorting them manually, and then selecting just one item from them; it 
seems a bit costly. Also, I would like to avoid defining public methods where 
they're not needed (they can actually be private).
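
To illustrate the cost concern: when only the single best class is needed, it can be picked in one pass over the scores without building or sorting an intermediate list. The following is only a sketch under that assumption; BestScore and indexOfMax are hypothetical names and do not come from the patch.

```java
// Hypothetical helper; not part of the Lucene classification module.
final class BestScore {
  private BestScore() {}

  // Returns the index of the highest score, or -1 for an empty array.
  // Single O(n) pass; no copy and no sort.
  static int indexOfMax(double[] scores) {
    int best = -1;
    for (int i = 0; i < scores.length; i++) {
      if (best == -1 || scores[i] > scores[best]) {
        best = i;
      }
    }
    return best;
  }
}
```

Sorting is O(n log n) and allocates a list per call, so it would only pay off when the caller actually consumes the full ranked list.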

> Lucene classification score calculation normalize and return lists
> ------------------------------------------------------------------
>
>                 Key: LUCENE-5699
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5699
>             Project: Lucene - Core
>          Issue Type: Sub-task
>          Components: modules/classification
>            Reporter: Gergő Törcsvári
>            Assignee: Tommaso Teofili
>         Attachments: 06-06-5699.patch
>
>
> Currently the classifiers can return only the "best matching" class. Anyone 
> who wants to use them for more complex tasks has to modify these classes to 
> get the second and third results too. If it is possible to return a list and 
> it does not cost a lot of resources, why don't we do that? (We iterate over 
> a list anyway.)
> The Bayes classifier returns values that are too small, and there was a bug 
> with zero floats, which was fixed by using logarithms. It would be nice to 
> scale the class scores so that they sum to one; then we could compare the 
> returned scores and relevance of two documents. (If we don't do this, the 
> word count of the test documents affects the result score.)
> In bullet points:
> * In the Bayes classifier, normalize the score values and return result 
> lists.
> * In the KNN classifier, add the possibility to return a result list.
> * Make ClassificationResult Comparable for list sorting.
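
For reference, scaling log-domain class scores so that they sum to one can be sketched as below. This is an illustration only, assuming the scores are natural-log values and at least one of them is finite; ScoreNormalizer and normalizeLogScores are hypothetical names, and the attached patch may do this differently.

```java
// Hypothetical helper; not part of the Lucene classification module.
final class ScoreNormalizer {
  private ScoreNormalizer() {}

  // Converts natural-log scores into values that sum to one.
  // Assumes at least one score is finite.
  static double[] normalizeLogScores(double[] logScores) {
    double[] out = new double[logScores.length];
    if (logScores.length == 0) {
      return out;
    }
    // Subtract the maximum before exponentiating so that very negative
    // log scores do not all underflow to zero.
    double max = Double.NEGATIVE_INFINITY;
    for (double s : logScores) {
      max = Math.max(max, s);
    }
    double sum = 0.0;
    for (int i = 0; i < logScores.length; i++) {
      out[i] = Math.exp(logScores[i] - max);
      sum += out[i];
    }
    // sum >= 1 because the maximum element contributes exp(0) = 1.
    for (int i = 0; i < out.length; i++) {
      out[i] /= sum;
    }
    return out;
  }
}
```

Because the result depends only on the differences between the log scores, a constant offset shared by all classes for a given document cancels out.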



--
This message was sent by Atlassian JIRA
(v6.2#6252)

