Hans Brende created ANY23-353:
---------------------------------

             Summary: RDFParseException: datatype rdf:langString requires a language tag
                 Key: ANY23-353
                 URL: https://issues.apache.org/jira/browse/ANY23-353
             Project: Apache Any23
          Issue Type: Bug
          Components: extractors
    Affects Versions: 2.3
            Reporter: Hans Brende
            Assignee: Hans Brende
             Fix For: 2.3


Extracting from [http://dbpedia.org/data/Trento.n3] fails with the following 
stack trace:

{code}
org.apache.any23.extractor.ExtractionException: Error while parsing RDF document.

        at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:179)
        at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:60)
        at org.apache.any23.extractor.SingleDocumentExtraction.runExtractor(SingleDocumentExtraction.java:471)
        at org.apache.any23.extractor.SingleDocumentExtraction.run(SingleDocumentExtraction.java:259)
        at org.apache.any23.Any23.extract(Any23.java:302)
        at org.apache.any23.Any23.extract(Any23.java:437)
        at org.apache.any23.Any23Test.testDemoCodeSnippet2(Any23Test.java:223)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
        at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
        at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
        at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
        at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.eclipse.rdf4j.rio.RDFParseException: datatype rdf:langString requires a language tag [line 303]
        at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:442)
        at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.createLiteral(RDFParserHelper.java:242)
        at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.createLiteral(AbstractRDFParser.java:571)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseQuotedLiteral(TurtleParser.java:672)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:597)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:474)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:412)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:407)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:372)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:239)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:201)
        at org.apache.any23.extractor.rdf.RDFParserFactory$ExtendedTurtleParser.parse(RDFParserFactory.java:352)
        at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:143)
        at org.apache.any23.extractor.rdf.RDFParserFactory$ExtendedTurtleParser.parse(RDFParserFactory.java:359)
        at org.apache.any23.extractor.rdf.BaseRDFExtractor.run(BaseRDFExtractor.java:174)
        ... 31 more
Caused by: java.lang.IllegalArgumentException: datatype rdf:langString requires a language tag
        at org.eclipse.rdf4j.model.impl.SimpleLiteral.<init>(SimpleLiteral.java:99)
        at org.eclipse.rdf4j.model.impl.AbstractValueFactory.createLiteral(AbstractValueFactory.java:118)
        at org.apache.any23.rdf.Any23ValueFactoryWrapper.createLiteral(Any23ValueFactoryWrapper.java:176)
        at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.createLiteral(RDFParserHelper.java:235)
        ... 44 more

{code}

This is caused by the following malformed data from DBpedia, in which literals 
are typed as rdf:langString but carry no language tag:

{code}
dbr:Trento      geo:geometry    "POINT(11.116666793823 46.066665649414)"^^virtrdf:Geometry ;
        dbp:mayor       "Alessandro Andreatta"^^rdf:langString ;
        dbp:imageFlag   "Flag of Trento.png"^^rdf:langString .
{code}

I propose that string literals of datatype rdf:langString that have no language 
tag simply fall back to datatype xsd:string instead of aborting the extraction.
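
A minimal sketch of that fallback rule, written in plain Java with no RDF4J or 
Any23 dependency so it can be run on its own. The class LangStringFallback and 
the method effectiveDatatype are hypothetical names used for illustration only, 
and datatypes are modeled as plain IRI strings. Where such a check would 
actually live (for example, in the createLiteral call shown in the stack trace) 
is an assumption and not something this issue has settled.

```java
/**
 * Hypothetical, library-free illustration of the proposed fallback.
 * Nothing here is part of the Any23 or RDF4J APIs.
 */
final class LangStringFallback {

    // Standard datatype IRIs from the RDF and XML Schema namespaces.
    static final String RDF_LANGSTRING =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";
    static final String XSD_STRING =
            "http://www.w3.org/2001/XMLSchema#string";

    private LangStringFallback() {
    }

    /**
     * Returns the datatype IRI to use for a literal.
     *
     * @param datatype    the datatype IRI declared in the source document
     * @param languageTag the literal's language tag, or null if it has none
     */
    static String effectiveDatatype(String datatype, String languageTag) {
        boolean hasLanguage = languageTag != null && !languageTag.trim().isEmpty();
        if (RDF_LANGSTRING.equals(datatype) && !hasLanguage) {
            // Malformed input such as "Alessandro Andreatta"^^rdf:langString:
            // treat it as a plain string instead of failing.
            return XSD_STRING;
        }
        // Every other combination is passed through unchanged.
        return datatype;
    }

    public static void main(String[] args) {
        // The dbp:mayor literal from the DBpedia snippet has no language tag.
        System.out.println(effectiveDatatype(RDF_LANGSTRING, null));
    }
}
```

With this rule, the two rdf:langString literals in the DBpedia snippet would be 
emitted as xsd:string, while literals that do carry a language tag, and 
literals of any other datatype such as virtrdf:Geometry, would be left as they 
are.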





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)