[jira] [Assigned] (JENA-1652) jena-text analyzer regression

Code Ferret (JIRA) Sun, 16 Dec 2018 08:02:31 -0800


     [ https://issues.apache.org/jira/browse/JENA-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Code Ferret reassigned JENA-1652:
---------------------------------

    Assignee: Code Ferret  (was: Osma Suominen)

> jena-text analyzer regression
> -----------------------------
>
>                 Key: JENA-1652
>                 URL: https://issues.apache.org/jira/browse/JENA-1652
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Text
>    Affects Versions: Jena 3.10.0
>         Environment: Ubuntu 16.04
> java version "1.8.0_191"
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)
>            Reporter: Osma Suominen
>            Assignee: Code Ferret
>            Priority: Major
>             Fix For: Jena 3.10.0
>
>
> I noticed that Skosmos unit tests are failing when run with Fuseki 3.10 
> snapshots:
> https://github.com/NatLibFi/Skosmos/issues/828
> Digging a bit deeper, it seems that jena-text no longer applies the 
> analyzer to query strings as it did in 3.9.0. The most likely cause is the 
> Lucene upgrade (JENA-1621), which may have changed how analyzers are 
> applied.
> Here is the text analyzer configuration I'm using:
> {noformat}
> <#indexLucene> a text:TextIndexLucene ;
>     ##text:directory <file:/tmp/lucene> ;
>     text:directory "mem" ;
>     text:entityMap <#entMap> ;
>     text:storeValues true ;
>     .
> <#entMap> a text:EntityMap ;
>     text:entityField      "uri" ;
>     text:graphField       "graph" ; ## enable graph-specific indexing
>     text:defaultField     "pref" ; ## Must be defined in the text:map
>     text:uidField         "uid" ;
>     text:langField        "lang" ;
>     text:map (
>          # skos:prefLabel
>          [ text:field "pref" ;
>            text:predicate skos:prefLabel ;
>            text:analyzer [ a text:LowerCaseKeywordAnalyzer ] ]
>          # skos:altLabel
>          [ text:field "alt" ;
>            text:predicate skos:altLabel ;
>            text:analyzer [ a text:LowerCaseKeywordAnalyzer ] ]
>          # skos:hiddenLabel
>          [ text:field "hidden" ;
>            text:predicate skos:hiddenLabel ;
>            text:analyzer [ a text:LowerCaseKeywordAnalyzer ] ]
>          ) .
> {noformat}
> Here is a minimal test file that I load into the default graph:
> {noformat}
> <http://example.org/guppy> <http://www.w3.org/2004/02/skos/core#prefLabel> "Guppy"@en-gb .
> {noformat}
> This is the query I'm using:
> {noformat}
> PREFIX text: <http://jena.apache.org/text#>
> SELECT * {
>   ?s text:query 'G*' .
> }
> {noformat}
> It returns one row (?s=<http://example.org/guppy>) on Fuseki 3.9.0 but 
> nothing with today's 3.10 snapshot.
> If I change 'G*' to lowercase 'g*', I get the expected match with the 3.10 
> snapshot as well. The analyzer should lowercase everything, so the case of 
> the query string should be irrelevant; it therefore seems that the analyzer 
> is not being applied to the query string.
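
The symptom described above can be reproduced in a toy model. The sketch below is not jena-text or Lucene code: the class AnalyzerSketch and its methods are hypothetical stand-ins, and the analyze method only imitates what a lowercasing keyword analyzer is expected to do. It shows why 'G*' stops matching once the query string is no longer passed through the same analyzer that was used at index time, while 'g*' still matches. It makes no claim about where the actual regression lies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class AnalyzerSketch {
    // Hypothetical stand-in for a lowercasing keyword analyzer:
    // the whole value becomes a single lowercased term.
    public static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Index time: every label is stored in its analyzed (lowercased) form.
    public static List<String> index(List<String> labels) {
        List<String> terms = new ArrayList<>();
        for (String label : labels) {
            terms.add(analyze(label));
        }
        return terms;
    }

    // Query time: a trailing '*' marks a prefix query. The flag models
    // whether the prefix is analyzed before matching (assumed 3.9.0
    // behaviour) or used verbatim (the behaviour reported for 3.10).
    public static List<String> prefixSearch(List<String> terms, String pattern,
                                            boolean analyzeQuery) {
        String prefix = pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
        if (analyzeQuery) {
            prefix = analyze(prefix);
        }
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = index(Arrays.asList("Guppy"));
        System.out.println(prefixSearch(terms, "G*", true));   // [guppy]
        System.out.println(prefixSearch(terms, "G*", false));  // []
        System.out.println(prefixSearch(terms, "g*", false));  // [guppy]
    }
}
```

In this model the indexed term is always "guppy", so an unanalyzed prefix "G" can never match it. That is consistent with the observation that only the lowercase query succeeds on the 3.10 snapshot.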



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
