[ https://issues.apache.org/jira/browse/JENA-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266225#comment-15266225 ]
ASF GitHub Bot commented on JENA-1172:
--------------------------------------
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/137
JENA-1172: restore support for blank nodes in jena-text
It seems that jena-text used to support blank nodes in the text index
(there is some infrastructure to support this, particularly in TextQueryFuncs)
but this was broken by JENA-999 (internal caching) or possibly even earlier.
There were no unit tests involving blank nodes that would have caught this.
This PR restores support for blank node subjects and adds a couple of
basic unit tests that cover this functionality.
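To illustrate the kind of round trip that has to work for blank node subjects, here is a minimal sketch. It is not the code in this PR. It assumes the index stores each subject as a single string and marks blank nodes with a "_:" prefix; the names SubjectKeys and Subject are hypothetical and are not jena-text classes.

```java
/** Hypothetical helper (not part of jena-text): maps subjects to and from index strings. */
final class SubjectKeys {
    // Assumed marker that distinguishes blank node labels from URIs in the index.
    private static final String BLANK_PREFIX = "_:";

    /** Hypothetical stand-in for an RDF subject: either a URI or a blank node label. */
    static final class Subject {
        final boolean blank;  // true for a blank node, false for a URI
        final String value;   // blank node label or URI string

        Subject(boolean blank, String value) {
            if (value == null) throw new IllegalArgumentException("value must not be null");
            this.blank = blank;
            this.value = value;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Subject)) return false;
            Subject other = (Subject) o;
            return blank == other.blank && value.equals(other.value);
        }

        @Override public int hashCode() {
            return 31 * value.hashCode() + (blank ? 1 : 0);
        }
    }

    /** Encode a subject as the string written to the index. */
    static String encode(Subject s) {
        return s.blank ? BLANK_PREFIX + s.value : s.value;
    }

    /** Decode an index string back into a subject, checking the blank marker first. */
    static Subject decode(String key) {
        if (key.startsWith(BLANK_PREFIX)) {
            return new Subject(true, key.substring(BLANK_PREFIX.length()));
        }
        return new Subject(false, key);
    }
}
```

The point of the sketch is that the read path has to look for the blank node marker before treating a stored string as a URI. If any step between the index and the query result assumes a URI, a blank node label ends up in a URI-only code path, which is consistent with the "is not a URI node" error reported in the issue.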
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/osma/jena jena-1172
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/jena/pull/137.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #137
----
commit b793fcc9c3ff8ab44d418f5afe456d3dafd4d1e7
Author: Osma Suominen <[email protected]>
Date: 2016-05-02T07:41:56Z
JENA-1172: failing unit test (assumes blank node support will be
implemented)
commit 01ec64590867be3ce377bdf509c2f88a2f1e70c3
Author: Osma Suominen <[email protected]>
Date: 2016-05-02T08:24:30Z
JENA-1172: restore support for blank nodes in jena-text
----
> blank nodes can break jena-text
> -------------------------------
>
> Key: JENA-1172
> URL: https://issues.apache.org/jira/browse/JENA-1172
> Project: Apache Jena
> Issue Type: Bug
> Components: Text
> Affects Versions: Jena 3.0.1
> Reporter: Osma Suominen
> Assignee: Osma Suominen
>
> Data with blank node subjects can break the jena-text index.
> For this example I use a typical jena-text configuration which indexes
> rdfs:label. Then I add this triple:
> {noformat}
> _:b0 <http://www.w3.org/2000/01/rdf-schema#label> "blank" .
> {noformat}
> There is no error (though I remember seeing WARNINGs in other situations like
> this) and the triple gets indexed.
> When I later execute this query:
> {noformat}
> PREFIX text: <http://jena.apache.org/text#>
> SELECT ?s { ?s text:query 'blank' }
> {noformat}
> I get this error:
> {noformat}
> 10:22:38 WARN [5] RC = 500 : java.lang.UnsupportedOperationException: 3ed87b7f14f612ef53788d889f6410d6 is not a URI node
> org.apache.jena.ext.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.UnsupportedOperationException: 3ed87b7f14f612ef53788d889f6410d6 is not a URI node
> at org.apache.jena.ext.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
> at org.apache.jena.ext.com.google.common.cache.LocalCache.get(LocalCache.java:3937)
> at org.apache.jena.ext.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
> at org.apache.jena.atlas.lib.cache.CacheGuava.getOrFill(CacheGuava.java:58)
> at org.apache.jena.query.text.TextQueryPF.query(TextQueryPF.java:291)
> at org.apache.jena.query.text.TextQueryPF.variableSubject(TextQueryPF.java:229)
> at org.apache.jena.query.text.TextQueryPF.exec(TextQueryPF.java:198)
> at org.apache.jena.sparql.pfunction.PropertyFunctionBase$RepeatApplyIteratorPF.nextStage(PropertyFunctionBase.java:106)
> {noformat}
> Note that this occurs whenever a jena-text query matches a blank node
> subject, so a single triple with a blank node subject can "taint" the
> whole index. This is the case with LCSH, which for whatever reason
> contains a few hundred blank nodes that have a skos:prefLabel
> property (among almost 8M triples that generally use URIs for everything).