Hans Brende created ANY23-353:
---------------------------------
Summary: RDFParseException: datatype rdf:langString requires a language tag
Key: ANY23-353
URL: https://issues.apache.org/jira/browse/ANY23-353
Project: Apache Any23
Issue Type: Bug
Components: extractors
Affects Versions: 2.3
Reporter: Hans Brende
Assignee: Hans Brende
Fix For: 2.3
When extracting from [http://dbpedia.org/data/Trento.n3], I get the following
error log:
{code}
org.apache.any23.extractor.ExtractionException: Error while parsing RDF document.
at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:179)
at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:60)
at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:471)
at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:259)
at org.apache.any23.Any23.extract(Any23.java:302)
at org.apache.any23.Any23.extract(Any23.java:437)
at org.apache.any23.Any23Test.testDemoCodeSnippet2(Any23Test.java:223)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.eclipse.rdf4j.rio.RDFParseException: datatype rdf:langString requires a language tag [line 303]
at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:442)
at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.createLiteral(RDFParserHelper.java:242)
at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.createLiteral(AbstractRDFParser.java:571)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseQuotedLiteral(TurtleParser.java:672)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:597)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:407)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
at org.apache.any23.extractor.rdf.RDFParserFactory$ExtendedTurtleParser.parse(RDFParserFactory.java:352)
at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:143)
at org.apache.any23.extractor.rdf.RDFParserFactory$ExtendedTurtleParser.parse(RDFParserFactory.java:359)
at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:174)
... 31 more
Caused by: java.lang.IllegalArgumentException: datatype rdf:langString requires a language tag
at org.eclipse.rdf4j.model.impl.SimpleLiteral.<init>(SimpleLiteral.java:99)
at org.eclipse.rdf4j.model.impl.AbstractValueFactory.createLiteral(AbstractValueFactory.java:118)
at org.apache.any23.rdf.Any23ValueFactoryWrapper.createLiteral(Any23ValueFactoryWrapper.java:176)
at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.createLiteral(RDFParserHelper.java:235)
... 44 more
{code}
This is caused by the following malformed data from DBpedia, where literals are typed as rdf:langString but carry no language tag:
{code}
dbr:Trento geo:geometry "POINT(11.116666793823 46.066665649414)"^^virtrdf:Geometry ;
dbp:mayor "Alessandro Andreatta"^^rdf:langString ;
dbp:imageFlag "Flag of Trento.png"^^rdf:langString .
{code}
I propose that, for string literals of datatype rdf:langString with no language
tag, we simply fall back to datatype xsd:string.
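
The proposed fallback could be sketched roughly as follows. This is a minimal, standalone illustration and not the actual Any23 patch: the class name LangStringFallback and the method effectiveDatatype are hypothetical, and plain strings stand in for the RDF4J IRI and Literal types so the snippet runs without any dependencies. In Any23 itself the check would presumably sit somewhere before Any23ValueFactoryWrapper.createLiteral delegates to the underlying value factory, which is where the trace above shows the exception being thrown.

```java
// Hypothetical sketch of the proposed fallback; not the actual Any23 fix.
public final class LangStringFallback {

    public static final String RDF_LANGSTRING =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";
    public static final String XSD_STRING =
            "http://www.w3.org/2001/XMLSchema#string";

    private LangStringFallback() {
    }

    /**
     * Returns the datatype IRI to use when creating a literal.
     * A literal declared as rdf:langString without a language tag
     * is downgraded to xsd:string; everything else is left unchanged.
     */
    public static String effectiveDatatype(String datatypeIri, String languageTag) {
        boolean hasLanguage = languageTag != null && !languageTag.isEmpty();
        if (RDF_LANGSTRING.equals(datatypeIri) && !hasLanguage) {
            return XSD_STRING;
        }
        return datatypeIri;
    }

    public static void main(String[] args) {
        // "Alessandro Andreatta"^^rdf:langString has no language tag,
        // so it falls back to xsd:string.
        System.out.println(effectiveDatatype(RDF_LANGSTRING, null));
        // A literal that does carry a language tag keeps rdf:langString.
        System.out.println(effectiveDatatype(RDF_LANGSTRING, "it"));
    }
}
```

With this approach the two dbp: literals above would be read as plain strings instead of aborting the whole extraction, at the cost of silently accepting input that is technically invalid.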