[jira] [Commented] (JENA-242) LARQ scores not normalized

laotao (JIRA) Thu, 03 May 2012 18:23:18 -0700

    [ 
https://issues.apache.org/jira/browse/JENA-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268016#comment-13268016
 ]


laotao commented on JENA-242:
-----------------------------

Raw Lucene scores (normalized or not) don't really reflect the absolute 
similarity between a query and its results. Perhaps the TF-IDF algorithm is not 
appropriate for calculating these similarities for RDF literals, because they 
are usually short compared to typical (web) documents. Have you considered 
other algorithms, e.g. minimal edit distance? 
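
To illustrate the edit-distance suggestion, here is a rough sketch of a 
Levenshtein-based similarity that always falls in [0, 1]. It is not part of 
LARQ or Lucene; the function names edit_distance and edit_similarity are 
hypothetical, and dividing by the longer string's length is just one possible 
way to normalize.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string, keep rows short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def edit_similarity(query: str, literal: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(query), len(literal))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(query, literal) / longest
```

A result filter could then keep only literals whose edit_similarity to the 
query is above some threshold (the cut-off would have to be chosen per 
application). Unlike a TF-IDF score, this value does not depend on the rest of 
the index, so it can be compared across queries.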
                
> LARQ scores not normalized
> --------------------------
>
>                 Key: JENA-242
>                 URL: https://issues.apache.org/jira/browse/JENA-242
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: LARQ
>    Affects Versions: LARQ 1.0.0
>         Environment: Fuseki
>            Reporter: laotao
>
> In previous versions the LARQ score seemed to be normalized to the range 
> [0, 1]. In LARQ 1.0.0 some scores can be higher than 1. 
> Normalized scores are needed to filter SPARQL results (so that only items 
> above a certain quality are shown).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
