[ 
https://issues.apache.org/jira/browse/JENA-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266215#comment-15266215
 ] 

Osma Suominen commented on JENA-1172:
-------------------------------------

It seems that jena-text did have some kind of support for blank nodes, but it 
was broken by recent changes (probably the internal caching), and there were no 
unit tests involving blank nodes that would have caught this.

> blank nodes can break jena-text
> -------------------------------
>
>                 Key: JENA-1172
>                 URL: https://issues.apache.org/jira/browse/JENA-1172
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Text
>    Affects Versions: Jena 3.0.1
>            Reporter: Osma Suominen
>            Assignee: Osma Suominen
>
> Data with blank node subjects can break the jena-text index.
> For this example I use a typical jena-text configuration which indexes 
> rdfs:label. Then I add this triple:
> {noformat}
> _:b0 <http://www.w3.org/2000/01/rdf-schema#label> "blank" .
> {noformat}
> There is no error (though I remember seeing WARNINGs in other situations like 
> this) and the triple gets indexed.
> When I later execute this query:
> {noformat}
> PREFIX text: <http://jena.apache.org/text#>
> SELECT ?s { ?s text:query 'blank' }
> {noformat}
> I get this error:
> {noformat}
> 10:22:38 WARN  [5] RC = 500 : java.lang.UnsupportedOperationException: 3ed87b7f14f612ef53788d889f6410d6 is not a URI node
> org.apache.jena.ext.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.UnsupportedOperationException: 3ed87b7f14f612ef53788d889f6410d6 is not a URI node
>       at org.apache.jena.ext.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
>       at org.apache.jena.ext.com.google.common.cache.LocalCache.get(LocalCache.java:3937)
>       at org.apache.jena.ext.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
>       at org.apache.jena.atlas.lib.cache.CacheGuava.getOrFill(CacheGuava.java:58)
>       at org.apache.jena.query.text.TextQueryPF.query(TextQueryPF.java:291)
>       at org.apache.jena.query.text.TextQueryPF.variableSubject(TextQueryPF.java:229)
>       at org.apache.jena.query.text.TextQueryPF.exec(TextQueryPF.java:198)
>       at org.apache.jena.sparql.pfunction.PropertyFunctionBase$RepeatApplyIteratorPF.nextStage(PropertyFunctionBase.java:106)
> {noformat}
> Note that this happens any time the jena-text query happens to match a blank 
> node subject. So a single triple with a blank node subject can "taint" the 
> whole index. This is what happens with LCSH, which for whatever reason 
> happens to contain a few hundred blank nodes that have a skos:prefLabel 
> property (among almost 8M triples that generally use URIs for everything).
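
The trace above suggests that somewhere on the query path a blank node subject 
is handled as if it were a URI. As a rough illustration only, here is a minimal 
sketch of one way to keep the two kinds of subject apart wherever a subject is 
turned into a string (an index field or a cache key) and back again. It is not 
taken from the jena-text source: the classes SubjectNode and SubjectCodec and 
the "_:" marker are hypothetical names chosen for this example, and the sketch 
uses only the Java standard library rather than Jena's own Node API.

```java
// Hypothetical stand-in for an RDF node; this is NOT Jena's Node class.
final class SubjectNode {
    final boolean blank;   // true for a blank node, false for a URI node
    final String value;    // blank node label, or the URI string

    SubjectNode(boolean blank, String value) {
        this.blank = blank;
        this.value = value;
    }
}

// Hypothetical codec for the subject stored in a text index or cache key.
final class SubjectCodec {
    // Assumed marker for blank nodes; borrowed from the Turtle "_:" syntax.
    static final String BLANK_PREFIX = "_:";

    // Mark blank nodes when storing so they can be recognised later.
    static String encode(SubjectNode node) {
        return node.blank ? BLANK_PREFIX + node.value : node.value;
    }

    // Rebuild the right kind of node instead of assuming every value is a URI.
    static SubjectNode decode(String stored) {
        if (stored.startsWith(BLANK_PREFIX)) {
            return new SubjectNode(true, stored.substring(BLANK_PREFIX.length()));
        }
        return new SubjectNode(false, stored);
    }
}

final class BlankNodeSubjectDemo {
    public static void main(String[] args) {
        // Label copied from the error message in the report.
        SubjectNode original = new SubjectNode(true, "3ed87b7f14f612ef53788d889f6410d6");
        String stored = SubjectCodec.encode(original);
        SubjectNode restored = SubjectCodec.decode(stored);
        System.out.println(stored + " -> blank=" + restored.blank);
    }
}
```

A round-trip check like the one in main, run for both a blank node and a URI 
subject, is the kind of unit test that might have caught the regression.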



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)